Biochemical information processing apparatus, biochemical information processing method, and biochemical information recording medium

ABSTRACT

A biochemical information processing apparatus includes a storage means storing biochemical information, input means for accepting input of data, reaction scheme detection means for detecting a chemical reaction scheme involving a compound, based on the data, and display means for displaying a reaction scheme diagram of the chemical reaction scheme. The storage means includes a compound information file, an enzyme information file, and a relation information file. The relation information file stores a list showing the relation among compound numbers of compounds, enzyme numbers of enzymes with either pertinent compound being a substrate, and enzyme numbers of enzymes with either pertinent compound being a product. The reaction scheme detection means includes a first process portion for preparing canonical data of the compound from the data and searching the compound information file based thereon to read out a compound number. It also includes a second process portion for reading an enzyme number of an enzyme with the compound being a substrate or a product, a third process portion for reading a compound number of another compound constituting a reaction system with the enzyme and additional information of the enzyme, and a fourth process portion for indicating a reaction scheme diagram of the compound on the display means.

TECHNICAL FIELD

The present invention relates to a processing apparatus and processing method for processing information in the biochemical field, and more particularly, to a processing apparatus and processing method that can search for a reaction path of a bio-related compound and continuously display the reaction path and that can obtain information concerning bio-related substances.

Further, the present invention concerns an information recording medium (computer program product), such as a flexible disk or a magnetic tape, in which biochemical information is recorded, and more particularly, the invention concerns an information recording medium having records of information for searching for a reaction path of a bio-related compound, information for continuously displaying the reaction path, information concerning the bio-related substances, and so on.

BACKGROUND ART

Compound database systems and programs storing compound information and reaction database systems and programs storing reaction information of compounds have been developed heretofore. The compound database systems and programs store compound information such as the physical properties and action of existing compounds, and access is made to the compound information with the structure of a compound as a key. The reaction database systems store the reaction information of existing compounds, and access is made to the reaction information with the structure of a compound as a key.

An example of such a compound database is “MACCS” which is a compound control system available from MDL Inc., Co., the United States. Examples of the reaction database systems include the integrated chemical information control system “ISIS” and reaction information control system “REACCS” available from MDL Inc., Co., the United States.

There are, however, no conventional compound/reaction database systems storing the relationship between compound and enzyme and the information concerning the bio-related substances in an integrated manner. Because of this, using the structure of a compound as a key, one was unable to efficiently obtain the information concerning the enzymes or the biochemical information related to the enzymes, substrates, and products. Also, there are no conventional compound/reaction database systems including a reaction path of plural compounds constructed in an integrated manner. It was, therefore, not possible to efficiently search for the reaction path involving a plurality of compounds.

Further, there are no conventional compound/reaction database systems collectively storing information concerning receptors existing for control of bio-function or for transmission of information in vivo, and the information concerning the bio-related substances (agonists and antagonists). It was, therefore, not possible to efficiently obtain the biochemical information related to the receptors, agonists, and antagonists.

An object of the present invention is to provide a biochemical information processing apparatus, biochemical information processing method, and information recording medium (computer program product) that solve the above problems. They permit one, even when the structure of a compound is used as a key, to efficiently obtain the information concerning the enzymes or the biochemical information related to the enzymes, substrates, and products; to efficiently search for a reaction path involving a plurality of compounds; and to efficiently obtain the biochemical information related to the receptors, agonists, and antagonists.

DISCLOSURE OF INVENTION

First explained is the biochemical information processing apparatus of the present invention.

The biochemical information processing apparatus of the present invention is a biochemical information processing apparatus comprising

storage means for storing biochemical information about compounds and enzymes,

input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information,

reaction scheme detection means for, when said input means accepts data about a compound being a substrate and/or a product, detecting a chemical reaction scheme involving said compound, based on the data, and

display means for indicating at least a reaction scheme diagram of the chemical reaction scheme;

wherein said storage means comprises

a compound information file storing a list showing the relation between compound numbers of the compounds and canonical data corresponding to said compounds, and additional information about said compounds,

an enzyme information file storing a list showing the relation among enzyme numbers of the enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and

a relation (correlation) information file storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with either said compound being a substrate, and enzyme numbers of enzymes with either said compound being a product; and

wherein said reaction scheme detection means comprises

a first process portion for preparing from the data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a second process portion for reading an enzyme number of an enzyme with the compound being a substrate or a product out of said relation information file, based on the compound number read out in said first process portion,

a third process portion for reading a compound number of another compound constituting a reaction system together with the enzyme of the enzyme number read out in said second process portion and said compound, and additional information about said enzyme out of said enzyme information file, and

a fourth process portion for indicating a reaction scheme diagram of the compound accepted through said input means on said display means from the compound number read out in said first process portion, the enzyme number read out in said second process portion, and the compound number of the another compound read out in said third process portion, and further indicating the additional information about the enzyme read out in said third process portion on said display means.

With the above biochemical information processing apparatus of the present invention, when the data about the compound accepted through the input means is supplied to the first process portion, the canonical data is prepared from this data. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, the compound number corresponding to the canonical data is read out of the file. The compound number read out in the first process portion is supplied to the second process portion, and the second process portion reads an enzyme number of an enzyme with this compound being a substrate or a product out of the relation information file.

The enzyme number read out in the second process portion is supplied to the third process portion, and the third process portion reads a compound number of another compound constituting a reaction system together with the enzyme and the foregoing compound, and additional information about the enzyme out of the enzyme information file. Then the compound number read out in the first process portion, the enzyme number read out in the second process portion, and the compound number of the another compound read out in the third process portion are supplied to the fourth process portion, and the fourth process portion lets the display means indicate a reaction scheme diagram of the compound accepted through the input means. Similarly, the additional information about the enzyme read out in the third process portion is also indicated on the display means.
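The flow through the first to fourth process portions can be illustrated with a minimal Python sketch. It assumes that the three files are held as in-memory dictionaries and that the canonical data has already been prepared as a string. The dictionary layout, the field names, the compound and enzyme numbers, and the function name `detect_reaction_schemes` are all hypothetical and are not taken from the specification; the hexokinase entry is simplified to a single substrate and a single product.

```python
# Hypothetical in-memory stand-ins for the three files; all numbers are invented.
COMPOUND_FILE = {  # canonical data -> compound number and additional information
    "canon:glucose": {"number": 1, "name": "glucose"},
    "canon:glucose-6-phosphate": {"number": 2, "name": "glucose 6-phosphate"},
}
ENZYME_FILE = {  # enzyme number -> substrates, products, additional information
    101: {"substrates": [1], "products": [2], "info": "hexokinase"},
}
RELATION_FILE = {  # compound number (key) -> enzyme numbers by role
    1: {"as_substrate": [101], "as_product": []},
    2: {"as_substrate": [], "as_product": [101]},
}


def detect_reaction_schemes(canonical_data):
    """Return one text diagram per enzyme that involves the compound."""
    # First process portion: canonical data -> compound number.
    record = COMPOUND_FILE.get(canonical_data)
    if record is None:
        return []  # canonical data not present in the compound information file

    # Second process portion: compound number -> enzyme numbers.
    relation = RELATION_FILE.get(record["number"], {})
    enzyme_numbers = relation.get("as_substrate", []) + relation.get("as_product", [])

    diagrams = []
    for enzyme_number in enzyme_numbers:
        # Third process portion: enzyme number -> partner compounds and enzyme info.
        enzyme = ENZYME_FILE[enzyme_number]
        left = " + ".join(str(n) for n in enzyme["substrates"])
        right = " + ".join(str(n) for n in enzyme["products"])
        # Fourth process portion: build what would be shown on the display means.
        diagrams.append(f"{left} --[{enzyme_number}: {enzyme['info']}]--> {right}")
    return diagrams


for line in detect_reaction_schemes("canon:glucose"):
    print(line)
```

Keying the relation information file by compound number is what lets the second step avoid scanning every enzyme record; the enzyme information file is consulted only for the enzyme numbers that the relation file returns.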

The biochemical information processing apparatus of the present invention may further comprise receptor information detection means for, when said input means accepts data about a compound, detecting additional information about a receptor with said compound being an agonist and/or an antagonist, based on the data, and in this case;

said storage means further stores biochemical information about receptors, and

further comprises a receptor information file storing a list showing the relation between receptor numbers of the receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors;

said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with either said compound being a substrate, the enzyme numbers of the enzymes with either said compound being a product, the receptor numbers of the receptors with either said compound being an agonist, and the receptor numbers of the receptors with either said compound being an antagonist; and

said receptor information detection means comprises

a fifth process portion for preparing from data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on said canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a sixth process portion for reading, based on the compound number read out in said fifth process portion, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file,

a seventh process portion for reading at least additional information about the receptor of the receptor number read out in said sixth process portion out of said receptor information file, and

an eighth process portion for indicating at least the additional information about the receptor read out in said seventh process portion on said display means.

In this case, in the biochemical information processing apparatus of the present invention, when the data about the compound accepted through the input means is supplied to the fifth process portion, canonical data is prepared from this data. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, the compound number corresponding to the canonical data is read out of the file. The compound number read out in the fifth process portion is supplied to the sixth process portion, and the sixth process portion reads a receptor number of a receptor with this compound being an agonist or an antagonist out of the relation information file. The receptor number read out in the sixth process portion is supplied to the seventh process portion, and the seventh process portion reads at least the additional information about the receptor out of the receptor information file. Then at least the additional information about the receptor read out in the seventh process portion is supplied to the eighth process portion, and the eighth process portion lets the display means indicate at least the additional information about the receptor.
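The receptor lookup performed by the fifth to eighth process portions follows the same pattern and can be sketched as below. This is only an illustration under stated assumptions: the dictionary layout, the field names `agonist_of` and `antagonist_of`, the compound and receptor numbers, and the function name `detect_receptor_info` are hypothetical, and the canonical data is again assumed to be available as a string.

```python
# Hypothetical in-memory stand-ins; numbers and field names are invented.
COMPOUND_FILE = {  # canonical data -> compound number
    "canon:isoproterenol": 10,
    "canon:propranolol": 11,
}
RECEPTOR_FILE = {  # receptor number -> agonists, antagonists, additional information
    500: {"agonists": [10], "antagonists": [11], "info": "beta-adrenergic receptor"},
}
RELATION_FILE = {  # compound number (key) -> receptor numbers by role
    10: {"agonist_of": [500], "antagonist_of": []},
    11: {"agonist_of": [], "antagonist_of": [500]},
}


def detect_receptor_info(canonical_data):
    """Return (role, receptor number, additional information) tuples."""
    # Fifth process portion: canonical data -> compound number.
    compound_number = COMPOUND_FILE.get(canonical_data)
    if compound_number is None:
        return []

    # Sixth process portion: compound number -> receptor numbers, by role.
    relation = RELATION_FILE.get(compound_number, {})
    results = []
    for role in ("agonist_of", "antagonist_of"):
        for receptor_number in relation.get(role, []):
            # Seventh process portion: receptor number -> additional information.
            info = RECEPTOR_FILE[receptor_number]["info"]
            results.append((role, receptor_number, info))
    # Eighth process portion: the caller shows these tuples on the display means.
    return results


print(detect_receptor_info("canon:propranolol"))
```

Because the relation information file carries the agonist and antagonist roles separately, the role can be reported alongside the receptor without reading the receptor record's compound lists.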

Also, the biochemical information processing apparatus of the present invention may further comprise reaction path detection means for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a reaction path, detecting the reaction path of said plurality of compounds, based on the data, and in this case;

said reaction path detection means comprises

a ninth process portion for preparing from the data about the compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a tenth process portion for reading, based on the compound number read out in said ninth process portion, an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product out of said relation information file,

an eleventh process portion for reading, based on each enzyme number read out in said tenth process portion, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file,

a twelfth process portion for repeating a process by said tenth process portion and a process by said eleventh process portion to obtain compounds and enzymes within the predetermined reaction path, and

a thirteenth process portion for indicating from enzyme numbers read out in said tenth process portion and compound numbers read out in said eleventh process portion a reaction scheme diagram of these compounds along the reaction path on said display means.

In this case, in the biochemical information processing apparatus of the present invention, when the data about the compound accepted through the input means is supplied to the ninth process portion, canonical data is prepared from this data. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, the compound number corresponding to the canonical data is read out of the file. The compound number read out in the ninth process portion is supplied to the tenth process portion, and the tenth process portion reads an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product out of the relation information file.

Each enzyme number read out in the tenth process portion is supplied to the eleventh process portion, and the eleventh process portion reads a compound number of a compound being a substrate for the enzyme and a compound number of a compound being a product by the enzyme out of the enzyme information file. The processes of the tenth process portion and the eleventh process portion are repeated in the twelfth process portion.

Then the enzyme numbers read out in the tenth process portion and the compound numbers read out in the eleventh process portion are supplied to the thirteenth process portion, and the thirteenth process portion lets the display means indicate a reaction scheme diagram of these compounds along a predetermined reaction path.
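The repetition carried out by the twelfth process portion amounts to a graph traversal that alternates between the relation information file and the enzyme information file. The sketch below uses a breadth-first traversal, which is one possible reading and not necessarily the order used by the apparatus. The numbers, the field names, the `max_rounds` limit standing in for "within the predetermined reaction path", and the function name `detect_reaction_path` are all assumptions. For brevity the relation information file is derived from the enzyme information file rather than stored separately.

```python
from collections import deque

# Hypothetical enzyme information file: three consecutive steps, numbers invented.
ENZYME_FILE = {
    101: {"substrates": [1], "products": [2]},
    102: {"substrates": [2], "products": [3]},
    103: {"substrates": [3], "products": [4]},
}

# Relation information file derived here for brevity (compound number as key).
RELATION_FILE = {}
for number, enzyme in ENZYME_FILE.items():
    for c in enzyme["substrates"]:
        RELATION_FILE.setdefault(c, {"as_substrate": [], "as_product": []})
        RELATION_FILE[c]["as_substrate"].append(number)
    for c in enzyme["products"]:
        RELATION_FILE.setdefault(c, {"as_substrate": [], "as_product": []})
        RELATION_FILE[c]["as_product"].append(number)


def detect_reaction_path(start_compound, max_rounds=10):
    """Collect compound and enzyme numbers reachable from start_compound."""
    seen_compounds = {start_compound}
    seen_enzymes = set()
    queue = deque([(start_compound, 0)])
    while queue:  # twelfth process portion: repeat the tenth and eleventh
        compound, depth = queue.popleft()
        if depth >= max_rounds:
            continue
        relation = RELATION_FILE.get(compound, {"as_substrate": [], "as_product": []})
        # Tenth process portion: enzymes using the compound as substrate or product.
        for enzyme_number in relation["as_substrate"] + relation["as_product"]:
            if enzyme_number in seen_enzymes:
                continue
            seen_enzymes.add(enzyme_number)
            # Eleventh process portion: substrates and products of that enzyme.
            enzyme = ENZYME_FILE[enzyme_number]
            for other in enzyme["substrates"] + enzyme["products"]:
                if other not in seen_compounds:
                    seen_compounds.add(other)
                    queue.append((other, depth + 1))
    return sorted(seen_compounds), sorted(seen_enzymes)


compounds, enzymes = detect_reaction_path(2)
# Thirteenth process portion: lay the steps out along the path.
for e in enzymes:
    print(ENZYME_FILE[e]["substrates"], f"--[{e}]-->", ENZYME_FILE[e]["products"])
```

Tracking the enzymes already visited keeps the traversal from looping on reversible or cyclic pathways, and the round limit bounds how far the search spreads from the selected compound.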

Further, the biochemical information processing apparatus of the present invention may be the following one. Namely, the apparatus is a biochemical information processing apparatus comprising

storage means for storing biochemical information about compounds and enzymes,

input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information,

reaction path detection means for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a reaction path, detecting the reaction path of said plurality of compounds, based on the data, and

display means for indicating at least a reaction scheme diagram of the chemical reaction scheme;

wherein said storage means comprises

a compound information file storing a list showing the relation between compound numbers of the compounds and canonical data corresponding to said compounds, and additional information about said compounds,

an enzyme information file storing a list showing the relation among enzyme numbers of the enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and

a relation (correlation) information file storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with either said compound being a substrate, and enzyme numbers of enzymes with either said compound being a product; and

wherein said reaction path detection means comprises

a ninth process portion for preparing from the data about the compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a tenth process portion for reading, based on the compound number read out in said ninth process portion, an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product out of said relation information file,

an eleventh process portion for reading, based on each enzyme number read out in said tenth process portion, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file,

a twelfth process portion for repeating a process by said tenth process portion and a process by said eleventh process portion to obtain compounds and enzymes within the predetermined reaction path, and

a thirteenth process portion for indicating from the enzyme numbers read out in said tenth process portion and compound numbers read out in said eleventh process portion a reaction scheme diagram of these compounds along the reaction path on said display means.

In this case, the biochemical information processing apparatus of the present invention may further comprise receptor information detection means for, when said input means accepts data about a compound, detecting additional information about a receptor with said compound being an agonist and/or an antagonist, based on the data, and in this case;

said storage means further stores biochemical information about receptors, and

further comprises a receptor information file storing a list showing the relation between receptor numbers of the receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors;

said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with either said compound being a substrate, the enzyme numbers of the enzymes with either said compound being a product, the receptor numbers of the receptors with either said compound being an agonist, and the receptor numbers of the receptors with either said compound being an antagonist; and

said receptor information detection means comprises

a fifth process portion for preparing from data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on said canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a sixth process portion for reading, based on the compound number read out in said fifth process portion, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file,

a seventh process portion for reading at least additional information about the receptor of the receptor number read out in said sixth process portion out of said receptor information file, and

an eighth process portion for indicating at least the additional information about the receptor read out in said seventh process portion on said display means.

Further, in the biochemical information processing apparatus of the present invention, preferably,

said input means accepts input of characteristic data about each of atoms constituting a compound and bonding pair data between atoms; and

said biochemical information processing apparatus preferably further comprises the following canonical data preparation means for preparing canonical data capable of uniquely specifying a chemical structure of said compound, based on each data accepted through said input means. Namely, said canonical data preparation means comprises

a constituent atom classification process portion for classifying, based on each data accepted through said input means, the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class,

a canonical number assignment process portion for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification process portion, and

a canonical data preparation process portion for preparing said canonical data, based on the canonical numbers assigned to the respective atoms in said canonical number assignment process portion.

With the canonical data preparation means according to the present invention having the above structure, the characteristic data about each atom and bonding pair data between atoms accepted through the input means is supplied to the canonical data preparation means. Then the canonical data preparation means prepares the canonical data, based on these data.

Namely, the canonical data preparation means first carries out the process of the constituent atom classification process portion to classify the atoms into different classes each for equivalent atoms, based on the characteristic data about each atom and the bonding pair data between atoms. Then class numbers of respective classes different from each other are assigned to the respective atoms. Next, the process of the canonical number assignment process portion is carried out to assign canonical numbers uniquely corresponding to the structure of the compound to the respective atoms, based on the class numbers assigned to the respective atoms and the bonding pair data between atoms. Further, the process of the canonical data preparation process portion is carried out to prepare the canonical data based on the canonical numbers assigned to the respective atoms and the characteristic data about the respective atoms.

Here, preferably, said constituent atom classification process portion assigns three types of attributes (a_i, b_ij, d_ij) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom,

where among said three types of attributes (a_i, b_ij, d_ij), a_i is a kind number of an atom of input number i, b_ij is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_ij is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path;

said canonical number assignment process portion is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned in that manner, said canonical number assignment process portion selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to said selected atom and having no canonical number yet; and

said canonical data preparation process portion gives three types of attributes (P_i, T_i, S_i) to each atom and aligns these attributes in line to prepare said canonical data,

where among said three types of attributes (P_i, T_i, S_i), P_i is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_i is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_i, and S_i is a symbol for a kind of the atom of canonical number i.
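The three stages described above (classification, canonical numbering, canonical data preparation) can be sketched in Python as follows. This is a rough illustration under several assumptions that are not stated in the text: d_ij is read as the number of atoms whose shortest-path distance from atom i is j bonds; a smaller class number is treated as the higher priority; remaining ties are broken by input number, which a complete implementation would need to resolve in an input-independent way; and an atom with no bonds receives P = 0 and T = 0. The function names are hypothetical. The sketch records only the three attributes named above, so it does not encode ring-closure bonds.

```python
from collections import deque


def _adjacency(n, bonds):
    """bonds: (atom index, atom index, bond kind number) triples."""
    adj = {i: {} for i in range(n)}
    for i, j, kind in bonds:
        adj[i][j] = kind
        adj[j][i] = kind
    return adj


def classify_atoms(kinds, bonds):
    """Assign class numbers from the attribute triple (a_i, b_ij, d_ij)."""
    adj = _adjacency(len(kinds), bonds)
    keys = []
    for i, a_i in enumerate(kinds):
        b_i = {}  # bond kind j -> number of such bonds at atom i
        for kind in adj[i].values():
            b_i[kind] = b_i.get(kind, 0) + 1
        dist = {i: 0}  # breadth-first search gives shortest distances
        queue = deque([i])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        d_i = {}  # distance j -> number of atoms at that distance (assumed reading)
        for steps in dist.values():
            if steps:
                d_i[steps] = d_i.get(steps, 0) + 1
        keys.append((a_i, tuple(sorted(b_i.items())), tuple(sorted(d_i.items()))))
    rank = {key: r for r, key in enumerate(sorted(set(keys)), start=1)}
    return [rank[key] for key in keys]


def assign_canonical_numbers(classes, bonds):
    """Number atoms from 1, growing outward from the lowest-numbered atom."""
    adj = _adjacency(len(classes), bonds)
    start = min(range(len(classes)), key=lambda i: (classes[i], i))
    canon = {start: 1}
    while len(canon) < len(classes):
        frontier = [a for a in canon if any(v not in canon for v in adj[a])]
        if not frontier:
            raise ValueError("structure is not connected")
        src = min(frontier, key=canon.get)
        nxt = min((v for v in adj[src] if v not in canon),
                  key=lambda v: (classes[v], v))
        canon[nxt] = len(canon) + 1
    return canon


def canonical_data(kinds, bonds):
    """Line up (P_i, T_i, S_i) in order of canonical number i."""
    classes = classify_atoms(kinds, bonds)
    canon = assign_canonical_numbers(classes, bonds)
    adj = _adjacency(len(kinds), bonds)
    by_number = {num: atom for atom, num in canon.items()}
    data = []
    for i in range(1, len(kinds) + 1):
        atom = by_number[i]
        if adj[atom]:
            partner = min(adj[atom], key=canon.get)
            data.append((canon[partner], adj[atom][partner], kinds[atom]))
        else:
            data.append((0, 0, kinds[atom]))
    return tuple(data)


# The same C-C-O skeleton entered in two different atom orders.
print(canonical_data(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)]))
print(canonical_data(["O", "C", "C"], [(0, 1, 1), (1, 2, 1)]))
```

The two calls at the end enter the same three-atom chain with the atoms listed in opposite order; the point of the canonical numbering is that both inputs lead to the same canonical data, which is what allows it to serve as a search key for the compound information file.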

Next explained is the biochemical information processing method of the present invention.

The biochemical information processing method of the present invention is a biochemical information processing method using an information processing apparatus comprising

storage means for storing biochemical information about compounds and enzymes,

input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information, and

display means for indicating at least a reaction scheme diagram of a chemical reaction scheme;

wherein said storage means comprises

a compound information file storing a list showing the relation between compound numbers of the compounds and canonical data corresponding to said compounds, and additional information about said compounds,

an enzyme information file storing a list showing the relation among enzyme numbers of the enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and

a relation (correlation) information file storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with either said compound being a substrate, and enzyme numbers of enzymes with either said compound being a product; and

wherein said biochemical information processing method comprises

a first step for, when said input means accepts data about a compound being a substrate and/or a product, preparing said canonical data uniquely indicating a chemical structure of said compound from the data, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a second step for reading an enzyme number of an enzyme with the compound being a substrate or a product out of said relation information file, based on the compound number read out in said first step,

a third step for reading a compound number of another compound constituting a reaction system together with the enzyme of the enzyme number read out in said second step and said compound, and additional information about said enzyme out of said enzyme information file, and

a fourth step for indicating a reaction scheme diagram of the compound accepted through said input means on said display means from the compound number read out in said first step, the enzyme number read out in said second step, and the compound number of the another compound read out in said third step, and further indicating the additional information about the enzyme read out in said third step on said display means.

With the above biochemical information processing method of the present invention, the processes of the first step to the fourth step make it possible to detect a reaction scheme. In the detection of a reaction scheme, first, the process of the first step is carried out to prepare canonical data from the data about the compound accepted through the input means. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, the compound number corresponding to the canonical data is read out of the file. Next, the process of the second step is carried out to read an enzyme number of an enzyme with the compound being a substrate or a product out of the relation information file, based on the compound number read out in the first step.

Further, the process of the third step is carried out to read a compound number of another compound constituting a reaction system together with the enzyme of the enzyme number read out in the second step and the compound, and the additional information about the enzyme out of the enzyme information file. Then the process of the fourth step is carried out to indicate the reaction scheme diagram of the compound accepted through the input means on the display means from the compound number read out in the first step, the enzyme number read out in the second step, and the compound number of the another compound read out in the third step. Similarly, the additional information about the enzyme read out in the third step is also indicated on the display means.

In the biochemical information processing method of the present invention,

said storage means may further store biochemical information about a receptor, and

may further comprise a receptor information file storing a list showing the relation between receptor numbers of the receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, and in this case;

said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with either said compound being a substrate, the enzyme numbers of the enzymes with either said compound being a product, the receptor numbers of the receptors with either said compound being an agonist, and the receptor numbers of the receptors with either said compound being an antagonist; and

said biochemical information processing method further comprises

a fifth step for, when said input means accepts data about a compound, preparing said canonical data uniquely indicating a chemical structure of said compound from the data, further searching said compound information file, based on said canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a sixth step for reading, based on the compound number read out in said fifth step, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file,

a seventh step for reading at least additional information about the receptor of the receptor number read out in said sixth step out of said receptor information file, and

an eighth step for indicating at least the additional information about the receptor read out in said seventh step on said display means.

In this case, in the biochemical information processing method of the present invention, the processes of the fifth step to the eighth step make it possible to detect receptor information. In the detection of receptor information, first, the process of the fifth step is carried out to prepare canonical data from the data about the compound accepted through the input means. Then the compound information file is searched based on the canonical data prepared, and if the canonical data exists in the compound information file, a compound number corresponding to the canonical data is read out therefrom. Next, the process of the sixth step is carried out to read, out of the relation information file, a receptor number of a receptor with the compound being an agonist or an antagonist, based on the compound number read out in the fifth step. Further, the process of the seventh step is carried out to read at least the additional information about the receptor of the receptor number read out in the sixth step out of the receptor information file. Then the process of the eighth step is carried out to display at least the additional information about the receptor read out in the seventh step on the display means.
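The fifth to eighth steps can be illustrated by a minimal Python sketch. The in-memory dictionaries standing in for the compound, relation, and receptor information files, the record fields, the sample entries, and the placeholder `canonicalize` function are all assumptions made for illustration; the actual file layouts and canonical data format are not fixed by this passage.

```python
from dataclasses import dataclass, field


@dataclass
class RelationRecord:
    """One row of the relation information file, keyed by compound number."""
    agonist_of: list = field(default_factory=list)     # receptor numbers
    antagonist_of: list = field(default_factory=list)  # receptor numbers


# Hypothetical stand-ins for the three files (sample data only).
COMPOUND_FILE = {"CANON-A": 1, "CANON-B": 2}           # canonical data -> compound number
RELATION_FILE = {1: RelationRecord(agonist_of=[10], antagonist_of=[11]),
                 2: RelationRecord()}
RECEPTOR_FILE = {10: {"name": "receptor X"}, 11: {"name": "receptor Y"}}


def canonicalize(data: str) -> str:
    """Placeholder for canonical data preparation (assumption, not the real method)."""
    return data.strip().upper()


def detect_receptor_info(data: str) -> list:
    # Fifth step: prepare canonical data and look up the compound number.
    number = COMPOUND_FILE.get(canonicalize(data))
    if number is None:
        return []  # canonical data not present in the compound information file
    # Sixth step: read receptor numbers from the relation information file.
    relation = RELATION_FILE.get(number, RelationRecord())
    found = []
    for role, receptor_numbers in (("agonist", relation.agonist_of),
                                   ("antagonist", relation.antagonist_of)):
        for receptor_number in receptor_numbers:
            # Seventh step: read additional information about the receptor.
            info = RECEPTOR_FILE[receptor_number]
            found.append((receptor_number, role, info["name"]))
    # Eighth step: the caller would indicate this list on the display means.
    return found
```

Calling `detect_receptor_info(" canon-a ")` on the sample data yields one entry per receptor for which the compound is an agonist or antagonist, while an unknown compound yields an empty list.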

The biochemical information processing method of the present invention may further comprise

a ninth step for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a reaction path, preparing said canonical data uniquely indicating a chemical structure of said compound from the data, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a tenth step for reading, based on the compound number read out in said ninth step, an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product out of said relation information file,

an eleventh step for reading, based on each enzyme number read out in said tenth step, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file,

a twelfth step for repeating a process by said tenth step and a process by said eleventh step to obtain compounds and enzymes within the predetermined reaction path, and

a thirteenth step for indicating from the enzyme numbers read out in said tenth step and compound numbers read out in said eleventh step a reaction scheme diagram of these compounds along the reaction path on said display means.

In this case, in the biochemical information processing method of the present invention, the processes of the ninth step to the twelfth step make it possible to detect a reaction path. In the detection of the reaction path, first, the process of the ninth step is carried out to prepare canonical data from the data about the predetermined compound accepted through the input means. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, a compound number corresponding to the canonical data is read out therefrom. Next, the process of the tenth step is carried out to read, out of the relation information file, an enzyme number of an enzyme with this compound being a substrate and an enzyme number of an enzyme with this compound being a product, based on the compound number read out in the ninth step. Further, the process of the eleventh step is carried out to read, based on each enzyme number read out in the tenth step, a compound number of a compound being a substrate for this enzyme and a compound number of a compound being a product of this enzyme out of the enzyme information file. The processes of the tenth step and the eleventh step are repeated in the twelfth step.

Then the process of the thirteenth step is carried out to indicate, from the enzyme numbers read out in the tenth step and the compound numbers read out in the eleventh step, the reaction scheme diagram of these compounds along a reaction path on the display means.
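The repetition of the tenth and eleventh steps can be sketched in Python as a breadth-first traversal that alternates between the relation information file and the enzyme information file. The dictionaries, field names, and sample numbers below are hypothetical, and the sketch follows every reachable reaction because this passage does not state how the search is confined to the predetermined reaction path.

```python
from collections import deque

# Hypothetical stand-ins for the stored files (sample data only).
COMPOUND_FILE = {"canon-1": 1, "canon-2": 2, "canon-3": 3}  # canonical data -> compound number
ENZYME_FILE = {
    100: {"substrates": [1], "products": [2]},
    101: {"substrates": [2], "products": [3]},
}
RELATION_FILE = {
    1: {"substrate_of": [100], "product_of": []},
    2: {"substrate_of": [101], "product_of": [100]},
    3: {"substrate_of": [], "product_of": [101]},
}


def detect_reaction_path(canonical_data):
    # Ninth step: look up the compound number for the canonical data.
    start = COMPOUND_FILE.get(canonical_data)
    if start is None:
        return None
    seen_compounds = {start}
    seen_enzymes = set()
    queue = deque([start])
    # Twelfth step: repeat the tenth and eleventh steps until nothing new appears.
    while queue:
        compound = queue.popleft()
        relation = RELATION_FILE.get(compound, {})
        # Tenth step: enzymes for which the compound is a substrate or a product.
        enzymes = relation.get("substrate_of", []) + relation.get("product_of", [])
        for enzyme in enzymes:
            if enzyme in seen_enzymes:
                continue
            seen_enzymes.add(enzyme)
            # Eleventh step: substrates and products of that enzyme.
            record = ENZYME_FILE[enzyme]
            for neighbour in record["substrates"] + record["products"]:
                if neighbour not in seen_compounds:
                    seen_compounds.add(neighbour)
                    queue.append(neighbour)
    # Thirteenth step: the caller would draw these reactions as a scheme diagram.
    return [(ENZYME_FILE[e]["substrates"], e, ENZYME_FILE[e]["products"])
            for e in sorted(seen_enzymes)]
```

Starting from the middle compound `"canon-2"`, the traversal reaches both the upstream and the downstream reaction, which is why the relation information file is keyed by compound number and records both the substrate and the product roles.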

Further, the biochemical information processing method of the present invention may be the following one. Namely, the method may be a biochemical information processing method using an information processing apparatus comprising

storage means for storing biochemical information about compounds and enzymes,

input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information, and

display means for indicating at least a reaction scheme diagram of a chemical reaction scheme;

wherein said storage means comprises

a compound information file storing a list showing the relation between compound numbers of the compounds and canonical data corresponding to said compounds, and additional information about said compounds,

an enzyme information file storing a list showing the relation among enzyme numbers of the enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and

a relation (correlation) information file storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with either said compound being a substrate, and enzyme numbers of enzymes with either said compound being a product; and

wherein said biochemical information processing method comprises

a ninth step for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a reaction path, preparing said canonical data uniquely indicating a chemical structure of said compound from the data, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a tenth step for reading, based on the compound number read out in said ninth step, an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product out of said relation information file,

an eleventh step for reading, based on each enzyme number read out in said tenth step, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file,

a twelfth step for repeating a process by said tenth step and a process by said eleventh step to obtain compounds and enzymes within the predetermined reaction path, and

a thirteenth step for indicating from enzyme numbers read out in said tenth step and compound numbers read out in said eleventh step a reaction scheme diagram of these compounds along the reaction path on said display means.

In this case, in the biochemical information processing method of the present invention,

said storage means may further store biochemical information about receptors, and

may further comprise a receptor information file storing a list showing the relation between receptor numbers of the receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, and in this case;

said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with either said compound being a substrate, the enzyme numbers of the enzymes with either said compound being a product, the receptor numbers of the receptors with either said compound being an agonist, and the receptor numbers of the receptors with either said compound being an antagonist; and

said biochemical information processing method further comprises

a fifth step for, when said input means accepts data about a compound, preparing said canonical data uniquely indicating a chemical structure of said compound from the data, further searching said compound information file, based on said canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a sixth step for reading, based on the compound number read out in said fifth step, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file,

a seventh step for reading at least additional information about the receptor of the receptor number read out in said sixth step out of said receptor information file, and

an eighth step for indicating at least the additional information about the receptor read out in said seventh step on said display means.

Further, in the biochemical information processing method of the present invention, preferably, said input means accepts input of characteristic data about each of atoms constituting a compound and bonding pair data between atoms; and

said biochemical information processing method further comprises

a constituent atom classification step for classifying, based on each data accepted through said input means, the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class,

a canonical number assignment step for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification step, and

a canonical data preparation step for preparing said canonical data enabling a chemical structure of said compound to be uniquely specified, based on the canonical numbers assigned to the respective atoms in said canonical number assignment step.

By the various steps for preparing the canonical data according to the present invention having such structure, the canonical data is prepared based on the characteristic data about each atom and the bonding pair data between atoms accepted through the input means.

Namely, first, in the constituent atom classification step, the atoms are classified into different classes each for equivalent atoms, based on the characteristic data about each atom and the bonding pair data between atoms. Then a different class number for each class is assigned to each atom. Next, in the canonical number assignment step, the canonical numbers uniquely corresponding to the structure of the compound are assigned to the respective atoms, based on the class numbers given to the respective atoms and the bonding pair data between atoms. Further, in the canonical data preparation step, the canonical data is prepared based on the canonical numbers given to the respective atoms and the characteristic data about each atom.

Here, preferably, said constituent atom classification step assigns three types of attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom,

where among said three types of attributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom of input number i, b_(ij) is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_(ij) is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path;

said canonical number assignment step is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned in that manner, said canonical number assignment step selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to said selected atom and having no canonical number yet; and

said canonical data preparation step gives three types of attributes (P_(i), T_(i), S_(i)) to each atom and aligns these attributes in line to prepare said canonical data,

where among said three types of attributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_(i) is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_(i), and S_(i) is a symbol for a kind of the atom of canonical number i.
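The three preparation steps can be sketched in Python under several loudly stated assumptions. The `KIND` table of kind numbers is hypothetical; d_(ij) is interpreted here as the number of atoms whose shortest path from atom i is exactly j bonds long; a smaller class number is assumed to mean a higher priority; ties between atoms of the same class are broken by input order, which a complete implementation would have to resolve more carefully; the molecule is assumed to be connected; and the output string format is invented for illustration.

```python
from collections import deque

KIND = {"C": 1, "N": 2, "O": 3}  # hypothetical kind numbers a_i


def _adjacency(n, bonds):
    adj = [[] for _ in range(n)]
    for i, j, bond_type in bonds:
        adj[i].append((j, bond_type))
        adj[j].append((i, bond_type))
    return adj


def _distance_counts(start, adj):
    """d_ij under the stated assumption: atoms at shortest-path distance j."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, _ in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    counts = {}
    for d in dist.values():
        if d > 0:
            counts[d] = counts.get(d, 0) + 1
    return tuple(counts[d] for d in sorted(counts))


def classify(atoms, adj):
    """Constituent atom classification: equal attribute triples share a class."""
    keys = []
    for i, symbol in enumerate(atoms):
        b = {}
        for _, bond_type in adj[i]:
            b[bond_type] = b.get(bond_type, 0) + 1
        keys.append((KIND[symbol], tuple(sorted(b.items())), _distance_counts(i, adj)))
    rank = {key: r + 1 for r, key in enumerate(sorted(set(keys)))}
    return [rank[key] for key in keys]


def assign_canonical_numbers(classes, adj):
    """Canonical number assignment, growing outward from canonical number 1."""
    n = len(classes)
    canon = {min(range(n), key=lambda i: (classes[i], i)): 1}
    while len(canon) < n:
        frontier = [i for i in canon if any(v not in canon for v, _ in adj[i])]
        if not frontier:
            raise ValueError("sketch assumes a connected molecule")
        anchor = min(frontier, key=lambda i: canon[i])
        free = [v for v, _ in adj[anchor] if v not in canon]
        nxt = min(free, key=lambda v: (classes[v], v))
        canon[nxt] = len(canon) + 1
    return canon


def canonical_data(atoms, bonds):
    """Canonical data preparation: align (P_i, T_i, S_i) in canonical order."""
    adj = _adjacency(len(atoms), bonds)
    canon = assign_canonical_numbers(classify(atoms, adj), adj)
    tokens = []
    for i in sorted(canon, key=canon.get):
        p, t = min((canon[v], bt) for v, bt in adj[i]) if adj[i] else (0, "")
        tokens.append(f"{p}{t}{atoms[i]}")
    return " ".join(tokens)
```

For the heavy-atom skeleton C-C-O, entering the atoms as `["C", "C", "O"]` or as `["O", "C", "C"]` gives the same string, which is the property that lets the canonical data serve as a search key for the compound information file.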

Next explained is the biochemical information computer program product (biochemical information recording medium) of the present invention.

The biochemical information computer program product of the present invention is a biochemical information computer program product used with an information processing apparatus comprising input means for accepting input of image data indicating biochemical information or symbolic data indicating biochemical information, display means for indicating at least a reaction scheme diagram of a chemical reaction scheme, and reading means for reading information out of a computer-usable medium;

said computer program product comprising the computer-usable medium having a file area for recording a file and a program area for recording a program and having computer-readable file and program embodied in said medium, for letting at least a reaction scheme diagram efficiently be searched for and be indicated by said display means, based on data input through said input means;

said computer program product having,

in said file area,

a computer-readable compound information file for storing a list showing the relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds,

a computer-readable enzyme information file for storing a list showing the relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and

a computer-readable relation (correlation) information file for storing a list showing the relation among the compound numbers of the compounds as a key, enzyme numbers of enzymes with either said compound being a substrate, and enzyme numbers of enzymes with either said compound being a product, and

having, in said program area,

a computer-readable reaction scheme detection program for, when said input means accepts data about a compound being a substrate and/or a product, detecting a chemical reaction scheme involving said compound, based on the data;

wherein said reaction scheme detection program comprises

a first computer-readable process routine for preparing from the data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a second computer-readable process routine for reading an enzyme number of an enzyme with the compound being a substrate or a product out of said relation information file, based on the compound number read out in said first process routine,

a third computer-readable process routine for reading a compound number of another compound constituting a reaction system together with the enzyme of the enzyme number read out in said second process routine and said compound, and additional information about said enzyme out of said enzyme information file, and

a fourth computer-readable process routine for indicating a reaction scheme diagram of the compound accepted through said input means on said display means from the compound number read out in said first process routine, the enzyme number read out in said second process routine, and the compound number of the other compound read out in said third process routine, and further indicating the additional information about the enzyme read out in said third process routine on said display means.

In the above biochemical information computer program product of the present invention, the compound information file etc. are recorded in the file area and the reaction scheme detection program is recorded in the program area.

The reaction scheme detection program can be executed using the information processing apparatus. By this execution, first, the process of the first process routine is carried out to prepare the canonical data from the data about the compound accepted through the input means. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, a compound number corresponding to the canonical data is read out therefrom.

Next, the process of the second process routine is carried out to read, out of the relation information file, an enzyme number of an enzyme with this compound being a substrate or a product, based on the compound number read out in the first process routine. Further, the process of the third process routine is carried out to read, out of the enzyme information file, a compound number of another compound constituting a reaction system together with the compound and the enzyme of the enzyme number read out in the second process routine, and the additional information about the enzyme. Then the process of the fourth process routine is carried out to indicate the reaction scheme diagram of the compound accepted through the input means on the display means from the compound number read out in the first process routine, the enzyme number read out in the second process routine, and the compound number of the other compound read out in the third process routine. Further, the additional information about the enzyme read out in the third process routine is also indicated on the display means.
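The four process routines of the reaction scheme detection program can be sketched in Python as a single function. The dictionaries standing in for the files in the file area, the field names, the sample entries, and the text form of the "diagram" are assumptions for illustration only; the canonical data is taken as already prepared.

```python
# Hypothetical stand-ins for the files in the file area (sample data only).
COMPOUND_FILE = {"canon-1": 1}  # canonical data -> compound number
RELATION_FILE = {1: {"substrate_of": [100], "product_of": [101]}}
ENZYME_FILE = {
    100: {"substrates": [1], "products": [2], "info": "hypothetical enzyme A"},
    101: {"substrates": [3], "products": [1], "info": "hypothetical enzyme B"},
}


def detect_reaction_scheme(canonical_data):
    # First routine: search the compound information file by canonical data.
    number = COMPOUND_FILE.get(canonical_data)
    if number is None:
        return []
    # Second routine: enzymes with the compound as a substrate or a product.
    relation = RELATION_FILE.get(number, {})
    enzymes = relation.get("substrate_of", []) + relation.get("product_of", [])
    lines = []
    for enzyme in enzymes:
        # Third routine: the other compounds of the reaction system and the
        # additional information about the enzyme.
        record = ENZYME_FILE[enzyme]
        left = " + ".join(f"C{c}" for c in record["substrates"])
        right = " + ".join(f"C{c}" for c in record["products"])
        # Fourth routine: one text line per reaction stands in for the diagram.
        lines.append(f"{left} --[E{enzyme}: {record['info']}]--> {right}")
    return lines
```

On the sample data, compound 1 appears once on the substrate side and once on the product side, so both the reaction that consumes it and the reaction that produces it are listed.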

The biochemical information computer program product of the present invention may further have, in said file area,

a computer-readable receptor information file storing a list showing the relation between receptor numbers of the receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors;

said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with either said compound being a substrate, the enzyme numbers of the enzymes with either said compound being a product, the receptor numbers of the receptors with either said compound being an agonist, and the receptor numbers of the receptors with either said compound being an antagonist; and

said computer program product further has, in said program area,

a computer-readable receptor information detection program for, when said input means accepts data about a compound, detecting additional information about a receptor with said compound being an agonist and/or an antagonist, based on the data; and

said receptor information detection program comprises

a fifth computer-readable process routine for preparing from data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on said canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a sixth computer-readable process routine for reading, based on the compound number read out in said fifth process routine, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file,

a seventh computer-readable process routine for reading at least additional information about the receptor of the receptor number read out in said sixth process routine out of said receptor information file, and

an eighth computer-readable process routine for indicating at least the additional information about the receptor read out in said seventh process routine on said display means.

In this case, in the above biochemical information computer program product of the present invention, the receptor information detection program is recorded in addition to the reaction scheme detection program in the program area.

The receptor information detection program can be executed using the information processing apparatus.

By this execution, first, the process of the fifth process routine is carried out to prepare the canonical data from the data about the compound accepted through the input means. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, a compound number corresponding to the canonical data is read out therefrom.

Next, the process of the sixth process routine is carried out to read, out of the relation information file, a receptor number of a receptor with this compound being an agonist or an antagonist, based on the compound number read out in the fifth process routine. Further, the process of the seventh process routine is carried out to read at least the additional information about the receptor of the receptor number read out in the sixth process routine out of the receptor information file. Then the process of the eighth process routine is carried out to indicate at least the additional information about the receptor read out in the seventh process routine on the display means.

The biochemical information computer program product of the present invention may further have, in said program area,

a computer-readable reaction path detection program for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a reaction path, detecting the reaction path of said plurality of compounds, based on the data, and in this case;

said reaction path detection program comprises

a ninth computer-readable process routine for preparing from the data about the compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a tenth computer-readable process routine for reading, based on the compound number read out in said ninth process routine, an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product out of said relation information file,

an eleventh computer-readable process routine for reading, based on each enzyme number read out in said tenth process routine, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file,

a twelfth computer-readable process routine for repeating a process by said tenth process routine and a process by said eleventh process routine to obtain compounds and enzymes within the predetermined reaction path, and

a thirteenth computer-readable process routine for indicating from enzyme numbers read out in said tenth process routine and compound numbers read out in said eleventh process routine a reaction scheme diagram of these compounds along the reaction path on said display means.

In this case, in the above biochemical information computer program product of the present invention, the reaction path detection program is recorded in addition to the reaction scheme detection program and the receptor information detection program in the program area.

The reaction path detection program can be executed using the information processing apparatus.

By this execution, first, the process of the ninth process routine is carried out to prepare the canonical data from the data about the predetermined compound accepted through the input means. Then the compound information file is searched based on the canonical data thus prepared, and if the canonical data exists in the compound information file, a compound number corresponding to the canonical data is read out therefrom.

Next, the process of the tenth process routine is carried out to read, out of the relation information file, an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product, based on the compound number read out in the ninth process routine. Further, the process of the eleventh process routine is carried out to read, based on each enzyme number read out in the tenth process routine, a compound number of a compound being a substrate of the enzyme and a compound number of a compound being a product of the enzyme out of the enzyme information file. The processes of the tenth process routine and the eleventh process routine are repeated in the twelfth process routine.

Then the process of the thirteenth process routine is carried out to indicate a reaction scheme diagram of these compounds along a reaction path on the display means from the enzyme numbers read out in the tenth process routine and the compound numbers read out in the eleventh process routine.

Further, the biochemical information computer program product of the present invention may be the following one. Namely, the product may be a biochemical information computer program product used with an information processing apparatus comprising input means for accepting input of image data indicating biochemical information or symbolic data indicating biochemical information, display means for indicating at least a reaction scheme diagram of a chemical reaction scheme, and reading means for reading information out of a computer-usable medium;

said computer program product comprising the computer-usable medium having a file area for recording a file and a program area for recording a program and having computer-readable file and program embodied in said medium, for letting at least a reaction scheme diagram efficiently be searched for and be indicated by said display means, based on data input through said input means;

said computer program product having,

in said file area,

a computer-readable compound information file for storing a list showing the relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds,

a computer-readable enzyme information file for storing a list showing the relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and

a computer-readable relation (correlation) information file for storing a list showing the relation among the compound numbers of the compounds as a key, enzyme numbers of enzymes with either said compound being a substrate, and enzyme numbers of enzymes with either said compound being a product, and

having, in said program area,

a computer-readable reaction path detection program for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a reaction path, detecting the reaction path of said plurality of compounds, based on the data;

wherein said reaction path detection program comprises

a ninth computer-readable process routine for preparing from the data about the compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and thereby reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a tenth computer-readable process routine for reading, based on the compound number read out in said ninth process routine, an enzyme number of an enzyme with the compound being a substrate and an enzyme number of an enzyme with the compound being a product out of said relation information file,

an eleventh computer-readable process routine for reading, based on each enzyme number read out in said tenth process routine, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file,

a twelfth computer-readable process routine for repeating a process by said tenth process routine and a process by said eleventh process routine to obtain compounds and enzymes within the predetermined reaction path, and

a thirteenth computer-readable process routine for indicating from enzyme numbers read out in said tenth process routine and compound numbers read out in said eleventh process routine a reaction scheme diagram of these compounds along the reaction path on said display means.

In this case, the biochemical information computer program product of the present invention may further have, in said file area,

a computer-readable receptor information file storing a list showing the relation between a receptor number of a receptor and a compound number of a compound being an agonist and/or an antagonist of said receptor, and additional information about said receptor, and in this case;

said relation information file stores a list to show the relation among a compound number of a compound as a key, an enzyme number of an enzyme with said compound being a substrate, an enzyme number of an enzyme with said compound being a product, a receptor number of a receptor with said compound being an agonist, and a receptor number of a receptor with said compound being an antagonist; and

said computer program product further has, in said program area,

a computer-readable receptor information detection program for, when said input means accepts data about a compound, detecting additional information about a receptor with said compound being an agonist and/or an antagonist, based on the data; and

said receptor information detection program comprises

a fifth computer-readable process routine for preparing from data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, searching said compound information file, based on this canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file,

a sixth computer-readable process routine for reading, based on the compound number read out in said fifth process routine, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file,

a seventh computer-readable process routine for reading at least additional information about a receptor of the receptor number read out in said sixth process routine out of said receptor information file, and

an eighth computer-readable process routine for indicating at least the additional information about the receptor read out in said seventh process routine on said display means.

Further, in the biochemical information computer program product of the present invention, preferably,

said input means accepts input of characteristic data about each of atoms constituting a compound and bonding pair data between atoms; and

said computer program product further has, in said program area,

a computer-readable canonical data preparation program for preparing canonical data capable of uniquely specifying a chemical structure of said compound, based on each data accepted through said input means. Namely, said canonical data preparation program comprises

a computer-readable constituent atom classification routine for classifying the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class,

a computer-readable canonical number assignment routine for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification routine, and

a computer-readable canonical data preparation routine for preparing said canonical data, based on the canonical numbers assigned to the respective atoms in said canonical number assignment routine.

By setting the biochemical information computer program productaccording to the present invention having such structure in apredetermined information processing apparatus and reading the canonicaldata preparation program stored in the program area, the canonical datapreparation program can be executed by the information processingapparatus. By start of the canonical data preparation program, theconstituent atom classification routine is first carried out to classifythe atoms into different classes each for equivalent atoms, based on thecharacteristic data about each atom and the bonding pair data betweenatoms. Then a different class number for each class is assigned to eachatom. Then the canonical number assignment routine is carried out toassign canonical numbers uniquely corresponding to the structure of thecompound to the respective atoms, based on the class numbers given tothe respective atoms and the bonding pair data between atoms. Further,the canonical data preparation routine is carried out to prepare thecanonical data based on the canonical numbers given to the respectiveatoms and the characteristic data about each atom.

Here, preferably, said constituent atom classification routine assigns three types of attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom,

where among said three types of attributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom of input number i, b_(ij) is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_(ij) is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path;

said canonical number assignment routine is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned in that manner, said canonical number assignment routine selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to said selected atom and having no canonical number yet; and

said canonical data preparation routine gives three types of attributes (P_(i), T_(i), S_(i)) to each atom and aligns these attributes in line to prepare said canonical data,

where among said three types of attributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_(i) is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_(i), and S_(i) is a symbol for a kind of the atom of canonical number i.
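The numbering rule of the canonical number assignment routine can be illustrated with a short sketch. It takes the class numbers as given (the classification step is not reproduced here) and rests on three assumptions that are not stated in the text above: a smaller class number means a higher priority, ties between atoms of the same class are broken by input index, and the molecule is connected. The function name, the five-atom graph, and its class numbers are hypothetical.

```python
def assign_canonical_numbers(bonds, class_numbers):
    """Assign canonical numbers 1..n to atoms.

    bonds: list of (i, j) pairs of atom input indices (0-based).
    class_numbers: class number of each atom, indexed by input index.
    Assumptions: smaller class number = higher priority; input index
    breaks ties; the molecule is connected.
    """
    n = len(class_numbers)
    neighbors = {i: set() for i in range(n)}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    def priority(atom):
        return (class_numbers[atom], atom)

    # Canonical number 1 goes to the atom with the highest class priority.
    canonical = {min(range(n), key=priority): 1}
    while len(canonical) < n:
        # Numbered atoms that still bond to at least one unnumbered atom.
        frontier = [a for a in canonical
                    if any(b not in canonical for b in neighbors[a])]
        if not frontier:
            raise ValueError("molecule is not connected")
        # Select the one with the minimum canonical number ...
        selected = min(frontier, key=canonical.get)
        # ... and number its highest-priority unnumbered neighbor n+1.
        candidates = [b for b in neighbors[selected] if b not in canonical]
        canonical[min(candidates, key=priority)] = len(canonical) + 1
    return [canonical[i] for i in range(n)]


# Hypothetical five-atom graph: atom 1 is bonded to atoms 0, 2 and 3; atom 3 to atom 4.
example_bonds = [(0, 1), (1, 2), (1, 3), (3, 4)]
example_classes = [3, 1, 3, 2, 4]
print(assign_canonical_numbers(example_bonds, example_classes))  # [3, 1, 4, 2, 5]
```

In the example, atom 1 (class 1) receives canonical number 1. Its neighbors are then numbered in order of class priority (atom 3, then atoms 0 and 2) before the numbering moves on to the neighbor of atom 3.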

The computer-usable medium according to the present invention is preferably a disk type recording medium or a tape type recording medium.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram to show the structure of an example of the biochemical information processing apparatus of the present invention.

FIG. 2 is an example of a reaction path diagram to show a path in which a compound of compound number C₁ changes up to a compound of compound number C₇.

FIG. 3 is a drawing to show the structure of a compound information file.

FIG. 4 is a drawing to show the structure of an enzyme information file.

FIG. 5 is a drawing to show the structure of a receptor information file.

FIG. 6 is a drawing to show the structure of an example of the relation information file according to the present invention.

FIG. 7 is a drawing to show the flow of data in the biochemical information processing apparatus.

FIG. 8A is a drawing to show a specific example of image data,

FIG. 8B a specific example of bond table data, and

FIG. 8C a specific example of canonical data, respectively.

FIG. 9A is a drawing to show a specific example of image data,

FIG. 9B a specific example of bond table data, and

FIG. 9C a specific example of canonical data, respectively.

FIGS. 10A-10C are drawings to show the relationship between image data and canonical data.

FIG. 11 is a flowchart to show the flow of process of a main routine.

FIG. 12 is a flowchart to show the flow of process of a three-dimensional indication routine.

FIG. 13 is a flowchart to show the flow of process of a reaction scheme detection routine.

FIG. 14 is a flowchart to show the flow of process of a reaction path detection routine.

FIG. 15 is a flowchart to show the flow of process of the reaction path detection routine.

FIG. 16 is a drawing to show an example of indication on a display.

FIG. 17 is a drawing to show another example of indication on the display.

FIG. 18A is a drawing to show the contents of an atomic table in the bond table, and

FIG. 18B is a drawing to show the contents of an atomic pair table in the bond table.

FIG. 19 is a schematic drawing to show the schematic operation of a canonical data preparing apparatus.

FIG. 20 is a flowchart to show the schematic process of the main routine.

FIG. 21 is a flowchart to show the schematic process of the constituent atom classification routine.

FIG. 22A is a drawing to show the contents of an atomic table in the bond table, and

FIG. 22B is a drawing to show the contents of an atomic pair table in the bond table.

FIG. 23 is a drawing to show the relationship between each of the atoms constituting 3,5-dimethyl-2,3,4,5-tetrahydropyridine and an input number thereof.

FIGS. 24A and 24B are drawings each showing the data contents of the reference table.

FIG. 25 is a drawing to show three types of attributes (a_(i), b_(ij), d_(ij)) given to each of the atoms constituting 3,5-dimethyl-2,3,4,5-tetrahydropyridine.

FIGS. 26A and 26B are drawings each showing the data contents of the reference table.

FIG. 27 is a drawing to show the data contents of the reference table.

FIGS. 28A and 28B are drawings each showing the data contents of the reference table.

FIGS. 29A and 29B are drawings each showing the data contents of the reference table.

FIGS. 30A-30C are drawings to show the relationship between each of the atoms constituting 3,5-dimethyl-2,3,4,5-tetrahydropyridine and a class number thereof.

FIG. 31 is a drawing to show attributes V_(ij)¹ given to the respective atoms constituting 3,5-dimethyl-2,3,4,5-tetrahydropyridine.

FIG. 32 is a drawing to show attributes V_(ij)² given to the respective atoms constituting 3,5-dimethyl-2,3,4,5-tetrahydropyridine.

FIG. 33 is a flowchart to show the schematic process of a canonical number assignment routine.

FIG. 34 is a drawing to show the relationship between each of the atoms constituting 3,5-dimethyl-2,3,4,5-tetrahydropyridine and a canonical number thereof.

FIG. 35 is a flowchart to show the schematic process of a canonical data preparation routine.

FIG. 36A is a drawing to show the contents of an atomic table in the bond table, and

FIG. 36B is a drawing to show the contents of an atomic pair table in the bond table.

FIG. 37 is a drawing to show the data contents of canonical tree structure data.

FIG. 38A is a molecular structure diagram of C₆₀ and

FIG. 38B is canonical data thereof.

FIG. 39 is a block diagram to show the structure of another example of the biochemical information processing apparatus of the present invention.

FIG. 40 is a block diagram to show the structure of an example of the canonical data preparing apparatus according to the present invention.

FIG. 41 is a block diagram to show the structure of still another example of the biochemical information processing apparatus of the present invention.

FIG. 42 is a drawing to show the structure of another example of the relation information file according to the present invention.

FIG. 43 is a flowchart to show the flow of process of another example of the main routine.

FIG. 44 is a block diagram to show the structure of an example of the biochemical information storage medium of the present invention.

FIG. 45 is a block diagram to show the structure of an example of the biochemical information processing apparatus according to the present invention.

FIG. 46 is a perspective view to show an example of the biochemical information processing apparatus according to the present invention.

FIG. 47 is a block diagram to show the structure of another example of the biochemical information storage medium of the present invention.

FIG. 48 is a block diagram to show the structure of an example of a recording medium for preparation of canonical data according to the present invention.

FIG. 49 is a block diagram to show the structure of another example of the canonical data preparing apparatus according to the present invention.

FIG. 50 is a block diagram to show the structure of still another example of the biochemical information storage medium of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

The preferred embodiments of the present invention will be described with reference to the accompanying drawings. FIG. 1 is a block diagram to show the structure of the biochemical information processing apparatus 1 according to an embodiment of the present invention. Referring to the drawing, the biochemical information processing apparatus 1 of the present embodiment comprises an image memory 10 for storing image data to indicate a molecular structure diagram or the like of a compound, a work memory 11 for temporarily storing data, a first storage device 20 for storing an operating system (OS) 21 and a biochemical information processing program 22, and a second storage device 30, being storage means, for storing various files. Further, it comprises a display 40 being display means, an input device 50, which is input means, having a mouse 51 for accepting input of image data and a keyboard 52 for accepting input of symbolic data, a printer 60 for outputting the image data or the like, and a CPU 70 for controlling execution or the like of the biochemical information processing program 22.

The biochemical information processing program 22 comprises a main program 23 for generally controlling processing, a three-dimensional indication program 24 for effecting three-dimensional indication of image data, a reaction scheme detection program 25 being reaction scheme detection means, a receptor information detection program 26 being receptor information detection means, and a reaction path detection program 27 being reaction path detection means. The reaction scheme detection program 25 is a program for detecting a chemical reaction scheme concerning a compound as being a substrate and/or a product, which comprises first process routine 25 a to fourth process routine 25 d. The receptor information detection program 26 is a program for detecting additional information about a receptor, which comprises fifth process routine 26 a to eighth process routine 26 d. Further, the reaction path detection program 27 is a program for detecting a reaction path of plural compounds, which comprises ninth process routine 27 a to thirteenth process routine 27 e.

The receptor information detection program 26 can handle not only receptors intrinsic to living bodies, such as hormone receptors, but also receptors of drugs or the like, and conceptual receptors whose existence has not yet been confirmed.

The second storage device 30 comprises a compound information file 31, an enzyme information file 32, a relation (correlation) information file 33, a partial correlation data file 34, a bond table file (which will also be referred to as a bond table information file) 35, and a receptor information file 36. Among them, the compound information file 31 stores a list to show the relationship between compound numbers of compounds and canonical data corresponding to the compounds, and additional information (for example, the reference data of FIG. 3) about the compounds. The enzyme information file 32 stores a list to show the relationship among enzyme numbers of enzymes, compound numbers of compounds being substrates of the enzymes, and compound numbers of compounds being products by the enzymes, and additional information (for example, the reference data of FIG. 4) about the enzymes. Further, the relation information file 33 stores a list to show the relationship among compound numbers of compounds, enzyme numbers of enzymes with a relevant compound being a substrate, enzyme numbers of enzymes with a relevant compound being a product, receptor numbers of receptors with a relevant compound being an agonist, and receptor numbers of receptors with a relevant compound being an antagonist. Furthermore, the partial correlation data file 34 stores the reaction path information, and the bond table file 35 stores the bond table data. Moreover, the receptor information file 36 stores a list to show the relationship among receptor numbers of receptors, compound numbers of compounds being agonists of the receptors, and compound numbers of compounds being antagonists of the receptors, and additional information (for example, the reference data of FIG. 5) about the receptors.

Next explained is the detailed structure of the compound information file 31, enzyme information file 32, relation information file 33, and receptor information file 36. FIG. 2 is an example of a reaction path diagram to show a path through which a compound of compound number C₁ changes in order into compounds of compound numbers C₂, C₃, . . . with plural enzymes of enzyme numbers E₁ to E₆ as catalysts, finally changing into a compound of compound number C₇, and is also an example of a drawing to show circumstances in which compounds C₆-C₁₂ serve as an agonist or as an antagonist to receptors R₁-R₄.

The compound numbers C₁-C₇ described in this example of reaction path diagram are recorded in the compound information file 31 shown in FIG. 3. The compound information file 31 includes a record of canonical data corresponding to each compound of compound number C₁-C₇, and the reference data (name, literature, physical properties, etc.) about each compound of compound number C₁-C₇ in the form of a list corresponding to the compound numbers C₁-C₇. When access is made to the compound information file 31, using the compound number C₁-C₇ as a key, the canonical data and reference data can be read out as to each compound of compound number C₁-C₇. Here, the canonical data is a plurality of symbolic data for uniquely specifying the chemical structure of each compound. The details of the canonical data will be described hereinafter.

The enzyme numbers E₁-E₆ described in the example of reaction path diagram of FIG. 2 are recorded in the enzyme information file 32 shown in FIG. 4. The enzyme information file 32 includes a record of the compound numbers C₁-C₆ of compounds being substrates of the respective enzymes of enzyme numbers E₁-E₆, the compound numbers C₂-C₇ of compounds being products by the respective enzymes of enzyme numbers E₁-E₆, and the reference data (name, literature, physical properties, inhibitor, inducer, activator, etc.) about each enzyme of enzyme number E₁-E₆ in the form of a list corresponding to the enzyme numbers E₁-E₆.

Therefore, when access is made to the enzyme information file 32 using the enzyme number E₁-E₆ as a key, the compound numbers C₁-C₇ being the substrate and product, and the reference data can be read out as to each enzyme of enzyme number E₁-E₆. It is also possible to similarly handle reactions by enzymes not subjected to enzyme classification or to identification of enzyme yet, nonenzymatic reactions involving light, heat, acid, base, metal ion, or the like, and multi-step reactions by a plurality of enzymes.

Further, the receptor numbers R₁-R₄ are recorded in the receptor information file 36 shown in FIG. 5. The receptor information file 36 includes a record of the compound numbers C₆, C₁₀-C₁₂ of the compounds being agonists of the respective receptors of receptor numbers R₁-R₄, the compound numbers C₇-C₉ of the compounds being antagonists of the respective receptors of receptor numbers R₁-R₄, and the reference data (name, literature, physical properties, action, etc.) about each receptor of receptor number R₁-R₄ in the form of a list corresponding to the receptor numbers R₁-R₄.

Therefore, when access is made to the receptor information file 36, using the receptor number R₁-R₄ as a key, the compound numbers C₆-C₁₂ being the agonist and antagonist, and the reference data can be read out as to each receptor of the receptor number R₁-R₄.

Furthermore, the mutual relation among compound numbers C₁-C₁₂, enzyme numbers E₁-E₆, and receptor numbers R₁-R₄ is recorded in the relation information file 33 shown in FIG. 6. Describing in more detail, the enzyme numbers E₁-E₆ of enzymes with each compound of compound number C₁-C₆ being a substrate, the enzyme numbers E₁-E₆ of enzymes with each compound of compound number C₂-C₇ being a product, and the enzyme number E₄ of the enzyme inhibited by the compound of compound number C₆ are recorded in the form of a list corresponding to the compound numbers C₁-C₇. In addition, the receptor numbers R₁-R₄ of receptors with each compound of compound number C₆, C₁₀-C₁₂ being an agonist, and the receptor numbers R₂, R₄ of receptors with each compound of compound number C₇-C₉ being an antagonist are recorded in the form of a list corresponding to the compound numbers C₆-C₁₂.

Therefore, when access is made to the relation information file 33, using the compound number C₁-C₇ as a key, it is possible to read out the enzyme numbers E₁-E₆ of the enzymes with each compound of compound number C₁-C₇ being a substrate or a product, and the enzyme number E₄ of the enzyme inhibited by the compound of compound number C₆. When access is made to the relation information file 33, using the compound number C₆-C₁₂ as a key, it is possible to read out the receptor numbers R₁-R₄ of the receptors with each compound of compound number C₆-C₁₂ being an agonist or an antagonist.

Next, the data contents of the enzyme information file 32 will be explained specifically. First, from the reaction path diagram of FIG. 2, a compound number of a compound being a substrate for the enzyme of enzyme number E₁ is C₁. A compound number of a compound being a product by the enzyme of the enzyme number E₁ is C₂. Therefore, C₁ is recorded in the column of (substrate) compound number corresponding to the enzyme number E₁ in the enzyme information file 32 of FIG. 4. In addition, C₂ is recorded in the column of (product) compound number corresponding to the enzyme number E₁.

Similarly, from the reaction path diagram of FIG. 2, a compound number of a compound being a substrate for the enzyme of enzyme number E₂ is C₂. Further, a compound number of a compound being a product by the enzyme of enzyme number E₂ is C₃. Therefore, C₂ is recorded in the column of (substrate) compound number corresponding to the enzyme number E₂ in the enzyme information file 32 of FIG. 4. Also, C₃ is recorded in the column of (product) compound number corresponding to the enzyme number E₂.

Such relation also holds for the enzyme numbers E₃-E₆ similarly, so that the compound numbers C₃-C₇ along the reaction path diagram of FIG. 2 are recorded in each of the columns of (substrate) compound number and (product) compound number corresponding to the enzyme numbers E₃-E₆.

Next, the data contents of the receptor information file 36 will be described specifically. As shown in FIG. 5, the compound number C₆ of the compound being an agonist for a receptor of receptor number R₁ is recorded in the column of (agonist) compound number. Also, a compound number C₈ of a compound being an antagonist for a receptor of receptor number R₂ is recorded in the column of (antagonist) compound number. Further, compound numbers C₁₀, C₁₁ of compounds being agonists for a receptor of receptor number R₃ are recorded in the column of (agonist) compound number. Furthermore, a compound number C₁₂ of a compound being an agonist for a receptor of receptor number R₄ is recorded in the column of (agonist) compound number while compound numbers C₇, C₉ of compounds being antagonists for the receptor of receptor number R₄ are recorded in the column of (antagonist) compound number. The relation between these receptor numbers and compound numbers is apparent from the reaction path diagram of FIG. 2.

Next, the data contents of the relation information file 33 will be described specifically. First, from the reaction path diagram of FIG. 2, the enzyme number of the enzyme with the compound of compound number C₁ being a substrate is E₁. Therefore, E₁ is recorded in the column of (substrate) enzyme number corresponding to the compound number C₁ in the relation information file 33 of FIG. 6.

Similarly, from the reaction path diagram of FIG. 2, the enzyme number of the enzyme with the compound of compound number C₂ being a substrate is E₂. Also, the enzyme number of the enzyme with the compound of compound number C₂ being a product is E₁. Therefore, E₂ is recorded in the column of (substrate) enzyme number corresponding to the compound number C₂ in the relation information file 33 of FIG. 6. Also, E₁ is recorded in the column of (product) enzyme number corresponding to the compound number C₂.

Such relation also holds for the compound numbers C₃-C₇ similarly, so that the enzyme numbers E₂-E₆ along the reaction path diagram of FIG. 2 are recorded in each of the columns of (substrate) enzyme number and (product) enzyme number corresponding to the compound numbers C₃-C₇ (which are used as a key upon search using the relation information file 33). Further, the compound of compound number C₆ is a substrate for the enzyme number E₆ and a product for the enzyme number E₅, while being an inhibitor for the enzyme number E₄, and thus, E₄ is recorded in the column of (inhibition) enzyme number.

Furthermore, the receptor number R₁ of the receptor for which the compound of compound number C₆ is an agonist is recorded in the column of (agonism) receptor number. Also, the receptor number R₄ of the receptor for which the compound of compound number C₇ is an antagonist is recorded in the column of (antagonism) receptor number. In a similar fashion, the receptor numbers R₂-R₄ of the receptors for which the compounds of compound numbers C₈-C₁₂ are agonists or antagonists are recorded in the columns of (agonism) receptor number and (antagonism) receptor number.
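The relation information file 33 is, in effect, an inverted index of the enzyme information file 32 and the receptor information file 36, keyed by compound number. The following minimal sketch rebuilds it from the example of FIG. 2 as described in the text. It assumes a linear path in which the enzyme E_k converts C_k into C_(k+1) (the text gives this explicitly only for E₁ and E₂), models the files as dictionaries, and passes the inhibitor relation as a separate hypothetical table; the function and variable names are illustrative.

```python
from collections import defaultdict

# Enzyme information file: E1..E6, assuming E_k converts C_k into C_(k+1).
enzyme_file = {
    f"E{k}": {"substrate": [f"C{k}"], "product": [f"C{k + 1}"]}
    for k in range(1, 7)
}
# Receptor information file, following the description of FIG. 5.
receptor_file = {
    "R1": {"agonist": ["C6"], "antagonist": []},
    "R2": {"agonist": [], "antagonist": ["C8"]},
    "R3": {"agonist": ["C10", "C11"], "antagonist": []},
    "R4": {"agonist": ["C12"], "antagonist": ["C7", "C9"]},
}
# Inhibitor relation (enzyme -> inhibiting compounds); a hypothetical table here.
inhibitors = {"E4": ["C6"]}


def build_relation_file(enzyme_file, receptor_file, inhibitors):
    """Invert the enzyme and receptor files into a compound-keyed index."""
    relation = defaultdict(lambda: {
        "substrate": [], "product": [], "inhibition": [],
        "agonism": [], "antagonism": [],
    })
    for enzyme, record in enzyme_file.items():
        for compound in record["substrate"]:
            relation[compound]["substrate"].append(enzyme)
        for compound in record["product"]:
            relation[compound]["product"].append(enzyme)
    for enzyme, compounds in inhibitors.items():
        for compound in compounds:
            relation[compound]["inhibition"].append(enzyme)
    for receptor, record in receptor_file.items():
        for compound in record["agonist"]:
            relation[compound]["agonism"].append(receptor)
        for compound in record["antagonist"]:
            relation[compound]["antagonism"].append(receptor)
    return dict(relation)


relation_file = build_relation_file(enzyme_file, receptor_file, inhibitors)
print(relation_file["C6"])
```

Under these assumptions the entry for C₆ lists E₆ as (substrate) enzyme, E₅ as (product) enzyme, E₄ as (inhibition) enzyme, and R₁ as (agonism) receptor, so a single access by compound number yields every related enzyme and receptor.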

Next, the flow of data in the biochemical information processing apparatus 1 is shown in FIG. 7. First, an operator draws a molecular structure diagram on the display 40 using the mouse 51, and then this molecular structure diagram is stored as image data 80 in the image memory 10. This image data 80 can be converted into any one of bond table data 81, canonical data 82, and three-dimensional data 83.

Conversion between the image data 80 and the bond table data 81 can be made using a graphic library corresponding to the OS used. The conversion algorithm between the bond table data 81 and the canonical data 82 will be described in detail hereinafter. The conversion algorithm between the bond table data 81 and the three-dimensional data 83 is described in “Abstracts, The 13th symposium of information science, p 25” by the present inventor.

The bond table data 81 after conversion is stored in the bond table file 35, the canonical data 82 in the work memory 11, and the three-dimensional data 83 in the image memory 10, respectively. When the operator gives input of symbolic data 84 indicating a name or the like, using the keyboard 52, a search process 84 b by a character string is carried out to the compound information file 31, and bond table data 81 is made from canonical data of a relevant compound. This bond table data 81 can also be converted similarly into either of the image data 80 and the three-dimensional data 83. In contrast, when the symbolic data 84 indicating an enzyme name or the like is input, the search process 84 b by a character string is carried out to the enzyme information file 32 to read a corresponding enzyme number out thereof, which can be used for the subsequent processes.

FIG. 8A to FIG. 8C show a specific example of image data 80 a, bond table data 81 a, and canonical data 82 a. FIG. 8A is the image data 80 a to show the molecular structure of compound “4-methylpyridine”. This image data 80 a can be converted into the bond table data 81 a shown in FIG. 8B. The bond table data 81 a is a table in which the number of atoms, the number of bonds, coordinates of each atom, an element symbol of each element, and so on are recorded. Using this bond table data 81 a, structures of all compounds can be expressed as numerical data.

Further, the bond table data 81 a can be converted into the canonical data 82 a shown in FIG. 8C. The canonical data 82 a is a symbolic string including an array of numerals, marks, and so on. As shown in FIG. 8C, the canonical data 82 a of compound “4-methylpyridine” is “1%1%1-2%3%5%N/6%7/”. In this way, the canonical data 82 a can express the structure of a compound in the form of a very short symbolic string. Because of this, if this canonical data 82 a is applied, for example, to a compound search system, the search speed can be increased and the storage resource can be effectively utilized.

It is, however, not easy to uniquely specify a compound with the bond table data described above, and it is thus not suitable to apply the bond table data to the compound search system. Namely, as shown in FIG. 9A to FIG. 9C, the image data 80 b is the data expressing the same compound as the image data 80 a, but the bond table data 81 b is utterly different from the bond table data 81 a. It is seen from this that a compound cannot be uniquely specified from the bond table data. In contrast, the canonical data 82 b obtained by converting the bond table data 81 b is the same as the canonical data 82 a, and can uniquely specify the compound.
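The difference between the two representations can be reproduced on a small scale. The sketch below is not the conversion algorithm of this specification: as a stand-in it uses a brute-force canonical form (the smallest table over all renumberings of the atoms), which is practical only for very small molecules. The heavy-atom bond list for 4-methylpyridine, the Kekule-style bond kinds (1 = single, 2 = double), and all function names are illustrative assumptions.

```python
from itertools import permutations


def relabel(atoms, bonds, perm):
    """Return the same molecule with atom i renumbered to perm[i]."""
    new_atoms = [None] * len(atoms)
    for old, new in enumerate(perm):
        new_atoms[new] = atoms[old]
    new_bonds = [(perm[a], perm[b], kind) for a, b, kind in bonds]
    return new_atoms, new_bonds


def table(atoms, bonds):
    """A bond-table-like value: element symbols plus sorted bonding pairs."""
    pairs = sorted((min(a, b), max(a, b), kind) for a, b, kind in bonds)
    return tuple(atoms), tuple(pairs)


def brute_force_canonical(atoms, bonds):
    """Stand-in canonical form: the minimum table over every renumbering."""
    return min(table(*relabel(atoms, bonds, perm))
               for perm in permutations(range(len(atoms))))


# Heavy atoms of 4-methylpyridine: ring N0-C1-C2-C3-C4-C5, methyl C6 on C3.
atoms_a = ["N", "C", "C", "C", "C", "C", "C"]
bonds_a = [(0, 1, 2), (1, 2, 1), (2, 3, 2), (3, 4, 1),
           (4, 5, 2), (5, 0, 1), (3, 6, 1)]

# The same compound entered with the atoms numbered in reverse order.
atoms_b, bonds_b = relabel(atoms_a, bonds_a, [6, 5, 4, 3, 2, 1, 0])

print(table(atoms_a, bonds_a) == table(atoms_b, bonds_b))  # False
print(brute_force_canonical(atoms_a, bonds_a)
      == brute_force_canonical(atoms_b, bonds_b))  # True
```

The two tables differ because they depend on the order in which the atoms were entered, whereas the canonical form does not. This independence from input order is the property that makes canonical data usable as a search key.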

In the bond table data 81 a and 81 b, the table with each data recorded is separated into a table running from atom number to mass and a table running from bonding atom pair to UP/DOWN. Accordingly, for example in the bond table data 81 a, the atom number (4) and the element symbol (N) correspond to each other, but the atom number (4) does not correspond to the bonding atom pair (4 5), the type of bond (1) and UP/DOWN (0).

Particularly, as shown in FIG. 10A to FIG. 10C, two image data 80 c, 80 d look completely different from each other, though both are image data indicating the same compound. The canonical data 82 c resulting from conversion of such image data 80 c, 80 d is the same, thus proving that the canonical data can uniquely specify a compound.

As described, the canonical data is superior to the bond table data in that it can uniquely specify a compound, and therefore, the canonical data is mainly used in each process of the biochemical information processing apparatus 1 of the present embodiment.

On the other hand, since the bond table data has the coordinate data, it is useful for displaying a molecular structure diagram of a compound on the display 40. Further, the two-dimensional coordinate data (X-coordinate and Y-coordinate) can be obtained by calculation from other data in the bond table data (though it is of course necessary to preliminarily designate the lengths of bonds, angles between bonds, the position of the center when displayed on the display, and so on).

Next, the biochemical information processing method according to the embodiment of the present invention will be explained. The biochemical information processing apparatus 1 is used for this processing method. First, under control of OS 21, the main program 23 of the biochemical information processing program 22 is started.

In the main program 23, as shown in the flowchart of FIG. 11, a selection screen of input method is first indicated on the display 40 (S100). When in accordance with this screen indication the operator selects input through the mouse 51 (S101), a screen for drawing of molecular structure diagram is indicated on the display 40. When the operator next inputs a molecular structure diagram indicating the structure of a predetermined compound using the mouse 51, this graphic image is accepted as image data to be stored in the image memory 10 (S102). This image data is also indicated on the display 40 (S103). Then this image data is converted into bond table data in accordance with the conversion algorithm discussed above (S104).

When in accordance with the screen indication of S100 the operator selects input through the keyboard 52 (S101), a symbolic string input screen is indicated on the display 40. When the operator next gives input of a symbolic string of a compound name, a chemical formula, or the like for specifying a predetermined compound using the keyboard, this input is accepted (S105), search of a compound specified by this symbolic string (S106) is carried out to the compound information file 31, and the bond table data 81 is prepared from the canonical data 82 of the pertinent compound (S106 b). Then the bond table data is converted into image data, based on the aforementioned two-dimensional coordinate data (S107), and this image data is indicated on the display 40 (S108).

On the other hand, when through input by the keyboard 52 symbolic data 84 indicating an enzyme name or the like is given, search by a character string (S106) is carried out to the enzyme information file 32 and a pertinent enzyme number is read out thereof to be used in similar processing.

After completion of processing at S104 and at S108, a selection screenfor selecting either one of the following processes is indicated on thedisplay 40 (S109). When in accordance with this screen indication theoperator selects a save process of the bond table data, the bond tabledata is written into the bond table file 35 (S111). After completion ofwriting into the bond table file 35, the processing returns to S109.When in accordance with the screen indication of S109 the operatorselects a three-dimensional indication process, the three-dimensionalindication program 24 is called out (S112). The three-dimensionalindication program 24 is a processing program for three-dimensionallyindicating a molecular structure diagram of compound. After completionof the process of three-dimensional indication program 24, theprocessing then returns to S109.

Further, when in accordance with the screen indication of S109 theoperator selects a reaction scheme detection process, the reactionscheme detection program 25 is called (S113). The reaction schemedetection program 25 is a processing program for searching the relationinformation file 33 or the like and detecting a reaction schemeinvolving the compound. After completion of the process of reactionscheme detection program 25, the processing then returns to S109.Furthermore, when in accordance with the screen indication of S109 theoperator selects a reaction path detection process, the reaction pathdetection program 27 is called (S114). The reaction path detectionprogram 27 is a processing program for searching the relationinformation file 33 or the like and detecting a reaction path of pluralcompounds. After completion of the process of reaction path detectionprogram 27 the processing then returns to S109.

Moreover, when in accordance with the screen indication of S109 theoperator selects a receptor information indication process, the receptorinformation detection program 26 is called (S115). The receptorinformation detection program 26 is a processing program for searchingthe relation information file 33 to read out an agonism receptor numberand/or an antagonism receptor number of a specific compound (the sixthprocess routine 26 b), searching the receptor information file 36 todetect the reference data for the receptor of the receptor number thusread out (the seventh process routine 26 c), and further indicating thereference data thus detected (the eighth process routine 26 d). Aftercompletion of the process of receptor information detection program 26,the processing then returns to S109. Furthermore, when in accordancewith the screen indication of S109 the operator selects a terminationprocess, the entire processing of the main program is terminated.

Next explained using the flowchart of FIG. 12 is the process of the three-dimensional indication program 24 called at S112. In this process, first, the bond table data is converted into the three-dimensional data of the molecular structure diagram in accordance with the above-described conversion algorithm (S120). Then an input prompt screen asking whether rotation indication or the like of this three-dimensional data is required is indicated on the display 40 (S121). When start of the three-dimensional indication program 24 is selected on this screen, the three-dimensional data is converted into image data, using the graphic library corresponding to the OS used (S124), and this image data is indicated on the display 40 (S125). Further, when in accordance with this screen indication the operator selects any one of a change process of conformation, a rotation process, an enlargement process, and a reduction process (S122), the selected process is carried out by ordinary formation techniques of three-dimensional graphics (S123).

Next explained using the flowchart of FIG. 13 is the process of the reaction scheme detection program 25 called at S113. In this process, first, the bond table data is converted into canonical data in accordance with the conversion algorithm discussed hereinafter (S130). Then a selection screen of search object is indicated on the display 40 (S131). Here, in the case of the operator selecting a reaction scheme, it is preferable that the input compound have preliminarily been designated as either a substrate or a product at the previous S102 or S105. Alternatively, immediately before the process of S130, input of designation of either a substrate or a product may be accepted together with the bond table data for the compound.

Under such conditions, when in accordance with the screen indication ofS131 the operator selects a reaction scheme (S132), the followingreaction scheme detection process is carried out. In this process,first, access is made to the compound information file 31 to search fora compound (S133). This search process is carried out based on thecanonical data of the compound converted into at S130. When this searchprocess ends with the result that the same canonical data as thecanonical data of the compound does not exist in the compoundinformation file 31 (S134), the process is terminated. If the samecanonical data as the canonical data of the compound exists in thecompound information file 31, the compound number corresponding to thiscanonical data is read out of the compound information file 31.

Based on the compound number (a key) read out at S133, an enzyme number of an enzyme with the compound being a substrate or a product (according to the aforementioned designation) is read out of the relation information file 33 (S135). Further, based on the enzyme number read out at S135, a (substrate) compound number, a (product) compound number, and reference data corresponding to this enzyme number are read out of the enzyme information file 32 (S136).

In this manner a reaction scheme diagram involving the compound is prepared from the compound number read out at S133 and the enzyme number read out at S135, and the image data of this reaction scheme diagram is indicated on the display 40. Also, the reference data about the enzyme read out at S136 is indicated on the display 40 (S137).
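As a rough illustration only, the lookups of S133 to S136 can be sketched in Python as follows. The dictionaries stand in for the compound information file 31, the relation information file 33, and the enzyme information file 32; their layout, the keys `as_substrate` and `as_product`, the function name `detect_reaction_schemes`, and all sample entries are assumptions made for this sketch and are not taken from the embodiment.

```python
# Hypothetical in-memory stand-ins for the files described in the text.
# Compound information file 31, indexed here by canonical data for lookup.
COMPOUND_INFO = {"CANON-A": "C1", "CANON-B": "C2"}

# Relation information file 33: compound number -> enzyme numbers by role.
RELATION_INFO = {
    "C1": {"as_substrate": ["E1"], "as_product": []},
    "C2": {"as_substrate": [], "as_product": ["E1"]},
}

# Enzyme information file 32: enzyme number -> reaction partners and reference data.
ENZYME_INFO = {
    "E1": {"substrate": "C1", "product": "C2", "reference": "hypothetical enzyme"},
}


def detect_reaction_schemes(canonical, role):
    """Return (substrate, enzyme, product, reference) tuples for one compound.

    role is "as_substrate" or "as_product", mirroring the designation
    accepted before S130.
    """
    compound_number = COMPOUND_INFO.get(canonical)        # S133: search by canonical data
    if compound_number is None:                           # S134: no match, terminate
        return []
    schemes = []
    for enzyme in RELATION_INFO.get(compound_number, {}).get(role, []):   # S135
        record = ENZYME_INFO[enzyme]                       # S136
        schemes.append(
            (record["substrate"], enzyme, record["product"], record["reference"])
        )
    return schemes


if __name__ == "__main__":
    print(detect_reaction_schemes("CANON-A", "as_substrate"))
```

Each returned tuple carries what S137 needs in order to draw one arrow from the substrate to the product with the enzyme reference data placed near it.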

The image data of reaction scheme diagram is indicated on the display 40preferably in such an arrangement that an arrow combines a molecularstructure diagram of the compound of the (substrate) compound numberobtained with a molecular structure diagram of the compound of (product)compound number and that the reference data of enzyme (especially, thename) is placed near the arrow. Conversion from the compound number tothe molecular structure diagram may be carried out, for example, in theorder of the compound number, the bond table data (making access to thebond table file), and the molecular structure diagram (using thetwo-dimensional coordinates).

Here, the first process routine 25 a performs the processes of from S130to S133, and these processes correspond to the first step. Also, thesecond process routine 25 b performs the process of S135, and thisprocess corresponds to the second step. Further, the third processroutine 25 c performs the process of S136, and this process correspondsto the third step. Yet further, the fourth process routine 25 d performsthe process of S137, and this process corresponds to the fourth step.

In the present invention, the first process portion, step and process routine, the fifth process portion, step and process routine, and the ninth process portion, step and process routine may be the same process portion, step and process routine, respectively.

Next, when in accordance with the screen indication of S131 the operatorselects a molecular structure diagram (S132), the following molecularstructure diagram detection process is carried out. In this process,first, access is made to the compound information file 31 to search fora compound of detection object (S138). The search process is carried outbased on the canonical data of the compound converted into at S130. Ifthis search process ends with the result that the same canonical data asthe canonical data of the detection object does not exist in thecompound information file 31 (S139), the process is terminated. If thesame canonical data as the canonical data of the detection object existsin the compound information file 31, the compound number of the compoundcorresponding to this canonical data is read out of the compoundinformation file 31.

Based on the compound number read out at S138, the reference data etc. is read out of the compound information file 31 and the relation information file 33 (S140). In this manner a molecular structure diagram of the compound being the detection object is prepared from the compound number read out at S138, and the image data of this molecular structure diagram is indicated on the display 40. The reference data for this compound read out at S140 is also indicated on the display 40 (S141).

Next explained using the flowcharts of FIG. 14 and FIG. 15 is theprocess of reaction path detection program 27 called at S114. In thisprocess, first, the bond table data of the center compound is convertedinto canonical data in accordance with the conversion algorithmdiscussed hereinafter, and subsequently, in order to determine areaction path area to be detected, input of the number of predeterminedreaction steps (for example, three reaction steps on the upstream sideand five reaction steps on the downstream side with respect to thecenter compound at the center) is accepted (S150).

Next, access is made to the compound information file 31 to search forthe center compound, based on the canonical data converted into at S150(S151). If this search process ends with the result that the samecanonical data as the canonical data of the center compound does notexist in the compound information file 31 (S152), the process isterminated. If the same canonical data as the canonical data of thecenter compound exists in the compound information file 31, the compoundnumber corresponding to this canonical data is read out of the compoundinformation file 31.

Based on the compound number (a key) read out at S151, an enzyme numberof an enzyme with this compound being a substrate and an enzyme numberof an enzyme with this compound being a product are read out of therelation information file 33 (S153). Further, based on each enzymenumber read out at S153, a compound number of a compound being asubstrate for this enzyme and a compound number of a compound being aproduct by this enzyme are read out of the enzyme information file 32(S154). Then the enzyme numbers read out at S153 and the compoundnumbers read out at S154 are successively added into the partialcorrelation data file 34 (S155).

The processes of from S153 to S155 are repeated for each compound numbernewly read out at S154, and compound numbers of all compounds and enzymenumbers of all enzymes within the reaction path of the predeterminednumber of steps are written into the partial correlation data file 34(S156).
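The repetition of S153 to S155 amounts to a breadth-first expansion around the center compound. The following Python sketch shows one way this could look. It simplifies the embodiment by using a single step limit `max_steps` instead of separate upstream and downstream limits, and the dictionary layouts, the function name `collect_reaction_path`, and the four-compound sample chain are hypothetical.

```python
from collections import deque

# Hypothetical sample data: C1 -E1-> C2 -E2-> C3 -E3-> C4.
RELATION_INFO = {   # stands in for relation information file 33
    "C1": {"as_substrate": ["E1"], "as_product": []},
    "C2": {"as_substrate": ["E2"], "as_product": ["E1"]},
    "C3": {"as_substrate": ["E3"], "as_product": ["E2"]},
    "C4": {"as_substrate": [], "as_product": ["E3"]},
}
ENZYME_INFO = {     # stands in for enzyme information file 32
    "E1": {"substrate": "C1", "product": "C2"},
    "E2": {"substrate": "C2", "product": "C3"},
    "E3": {"substrate": "C3", "product": "C4"},
}


def collect_reaction_path(center, max_steps):
    """Collect (enzyme, substrate, product) records within max_steps of center."""
    partial_correlation = []          # plays the role of partial correlation data file 34
    seen_enzymes = set()
    seen_compounds = {center}
    queue = deque([(center, 0)])
    while queue:
        compound, depth = queue.popleft()
        if depth == max_steps:        # stop expanding at the requested distance
            continue
        entry = RELATION_INFO.get(compound, {})
        enzymes = entry.get("as_substrate", []) + entry.get("as_product", [])   # S153
        for enzyme in enzymes:
            if enzyme in seen_enzymes:
                continue
            seen_enzymes.add(enzyme)
            record = ENZYME_INFO[enzyme]                                        # S154
            partial_correlation.append(
                (enzyme, record["substrate"], record["product"])                # S155
            )
            for other in (record["substrate"], record["product"]):
                if other not in seen_compounds:   # newly read compound, expand later
                    seen_compounds.add(other)
                    queue.append((other, depth + 1))
    return partial_correlation


if __name__ == "__main__":
    print(collect_reaction_path("C2", 1))
```

Tracking the enzymes and compounds already visited keeps the expansion from looping when the reaction path contains a cycle.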

Next, when a predetermined enzyme is designated in the reaction path inaccordance with an instruction of the operator (S157), a compound beinga substrate for this enzyme and a compound being a product by thisenzyme are read out of the compound information file 31 and the enzymeinformation file 32, and reaction scheme data is prepared from thesecompounds and enzyme (S158). Then this reaction scheme data is indicatedon the display 40 (S159). Further, access is made to the partialcorrelation data file 34 to obtain all adjacent reactions of thisreaction scheme, and arrows indicating these adjacent reactions areindicated on the display 40 (S160).

When the operator selects an indication of one of the adjacent reactions, based on the reaction scheme data thus indicated on the display 40 (S161), the flow returns to the process of S157 to prepare the reaction scheme data for the adjacent reaction.

Here, the ninth process routine 27 a performs the processes of S150 andS151, and these processes correspond to the ninth step. Also, the tenthprocess routine 27 b performs the process of S153, and this processcorresponds to the tenth step. Further, the eleventh process routine 27c performs the process of S154, and this process corresponds to theeleventh step. Furthermore, the twelfth process routine 27 d performsthe process of S156, and this process corresponds to the twelfth step.Moreover, the thirteenth process routine 27 e performs the processes offrom S157 to S161, and these processes correspond to the thirteenthstep.

Examples of indications on the display 40 by the processes of S159 and S160 are shown in FIG. 16 and FIG. 17. As shown in these drawings, the image data 80 f, 80 g each indicating the reaction scheme data is displayed on the display 40, and arrows indicating adjacent reactions are added to both ends of the reaction scheme data. Selection of an adjacent reaction at S161 is effected by clicking a portion of one of the arrows with the mouse 51. In this example, when the arrow at the left end of the image data 80 f is clicked with the mouse 51, the image data 80 g, which is the reaction one step before, is indicated. Any reaction scheme within the reaction path can be freely indicated by such switching of screens.

Next explained are canonical data preparation means and method suitably applicable to the present invention.

Algorithms applicable as the aforementioned conversion algorithm between the bond table data 81 and the canonical data 82 in either direction include the known Morgan algorithm (H. L. Morgan, J. Chem. Doc., 5(2), 107 (1965)) and the conversion algorithm by the present inventor, as described in Atsushi TOMONAGA “A Program Library for Chemical Information and Its Applications” Abstracts, The 13th symposium of information science, pages 25-28 (1990). The conventional conversion algorithm by the present inventor was able to obtain the canonical data more quickly than the Morgan algorithm, without intervention of a process for classifying atoms into equivalent atoms. However, because the attribute of an atom used therein was the number of atoms located at a specific minimum distance from the pertinent atom, it lacked preciseness in the determination of equivalent atoms, and the reliability of the canonical data obtained was not yet sufficient. Accordingly, the present invention particularly preferably employs the canonical data preparation means and method described in detail in the following.

First explained is the canonical data preparation means suitablyapplicable to the present invention. The biochemical informationprocessing apparatus 1, being the embodiment of the present inventionshown in FIG. 1, comprises the canonical data preparation meansaccording to the present invention; that is, it comprises the imagememory 10 for storing the image data of molecular structure diagram, thework memory 11 for temporarily storing the symbolic data or the like,the first storage device 20 storing the operating system (OS) 21 andcanonical data preparation program 91, and the second storage device 30storing the bond table file 35 and compound information file 31.

The biochemical information processing apparatus 1 comprises the display40 for indicating the molecular structure diagram, the mouse 51 being apointing device for accepting input of hand-drawn graphic image, thekeyboard 52 for accepting input of symbolic data such as a chemicalformula, the printer 60 for outputting the molecular structure diagram,and the CPU 70 for controlling execution or the like of the canonicaldata preparation program 91. The pointing devices include a tablet, adigitizer, a light pen, and so on as well as the mouse 51, and eitherone of these devices may replace the mouse 51.

The canonical data preparation program 91 is a program for preparing thecanonical data based on characteristic data about each of atomsconstituting a compound and bond pair data between atoms. This canonicaldata preparation program 91 comprises a main routine 91 a for generallycontrolling the processing, and a constituent atom classificationroutine (constituent atom classification process portion) 91 b forassigning class numbers to the respective atoms constituting thecompound. The canonical data preparation program 91 also comprises acanonical number assignment routine (canonical number assignment processportion) 91 c for assigning canonical numbers to the respective atoms,based on the class numbers, and a canonical data preparation routine(canonical data preparation process portion) 91 d for preparingcanonical data, based on the canonical numbers of the respective atoms.The second storage device 30 is provided with the bond table file 35capable of storing a plurality of bond tables 81. A bond table 81includes a record of characteristic data about each of the atomsconstituting the compound and bond pair data between atoms, and thecanonical data preparation program 91 can make access to these datathrough the bond table 81.

As shown in FIG. 18A and FIG. 18B, a bond table 81 comprises an atomic table 81 c including a record of characteristic data about the respective atoms, and an atomic pair table 81 d including a record of bonding pair data between atoms. Specifically, the atomic table 81 c is provided with columns for input number (also referred to as a number of atom), two-dimensional coordinates (X-coordinate and Y-coordinate) of atom, element symbol (which is generally an element name), attribute, the number of atoms, and the number of bonds to be written therein (see FIG. 18A), and the atomic pair table 81 d is provided with columns for bonding atom pair data, the type of bond (for example, 1 for single bond and 2 for double bond), and the structure (a column for distinction as to whether each atom belongs to a cyclic part or to a chain part of the molecular structure diagram) to be written therein (see FIG. 18B). Here, the input numbers are numbers for the computer to identify the atoms constituting the compound, and are numerals in the example of FIG. 18A, but may be symbols. The bonding atom pair data is preferably expressed as a combination of input numbers.

The preparation of canonical data does not require all the data in the above atomic table 81 c and atomic pair table 81 d; it is sufficient to have the number and element symbol of each atom as characteristic data, and the bonding atom pair data and type of bond as bonding pair data.

The second storage device 30 stores the compound information file 31including a record of a list to show the relation between a compoundnumber of a compound and canonical data corresponding to the compound.As shown in FIG. 3, the compound information file 31 is a file includinga record of the canonical data corresponding to each compound ofcompound number C₁-C₇ and the reference data (name, literature, physicalproperties, etc.) about each compound of compound C₁-C₇ in the form of alist corresponding to the compound numbers C₁-C₇. Therefore, if accessis made to the compound information file 31 using the compound numberC₁-C₇ as a key, the canonical data and reference data can be read outfor each compound of compound number C₁-C₇. Here, the canonical data isdata comprised of a plurality of symbols for uniquely specifying thechemical structure of each compound.

The constituent atom classification routine 91 b corresponds to the constituent atom classification step, the canonical number assignment routine 91 c to the canonical number assignment step, and the canonical data preparation routine 91 d to the canonical data preparation step, respectively.

Next explained is the schematic operation of the canonical datapreparation means. As shown in FIG. 19, the operator manipulates themouse 51 or the keyboard 52 to prepare a bond table 81 of a compound tobecome a preparation object of canonical data in the bond table file 35.

Input through the mouse 51 is handwritten input of the molecularstructure diagram of a compound on the display 40 with the mouse 51, andan input number of each atom defined in the input order is written inthe column of input number in the bond table 81 prepared in the secondstorage device 30. Further, bonding atom pair data indicating the bondrelation of each atom of this molecular structure diagram E₁ is writteninto the column of bonding atom pair in the bond table 81. As described,in the case of the input through the mouse 51, the bond table 81 forspecifying a compound is prepared from the handwritten molecularstructure diagram E₁.

Input through the keyboard 52 is input of a symbolic string forspecifying a bond table name corresponding to a predetermined compoundusing the keyboard 52, and, based on input symbolic data 11 a, a bondtable 81 specified by this bond table name is read out of the bond tablefile 35.

As described, the mouse 51 and keyboard 52 compose input means A (50), and a bond table 81 is obtained using either one of the mouse 51 and keyboard 52. Then the canonical data preparation program 91, being canonical data preparation means B, is carried out to prepare the canonical data 82, based on each data in the bond table 81. The canonical data 82 thus prepared is written into the compound information file 31 to be saved therein. Here, the reason why the canonical data 82 is prepared from the bond table 81 to be saved is that its storage area is smaller than when the bond table 81 itself is saved, while a compound can still be uniquely specified. Namely, the canonical data 82 prepared based on the bond table 81 shown in FIGS. 18A and 18B is “1%1%1-2%3%5% N/6%7/”, which expresses the structure of the compound uniquely by a very short string of characters, numerals, and symbols. By employing such a short symbolic string as the object to be saved, the storage resource can be effectively utilized, which can contribute to size and weight reductions of the apparatus.

The two-dimensional coordinate calculation process is carried out basedon each data in the bond table 81, thereby obtaining two-dimensionalcoordinate data of each atom. A molecular structure diagram E₂,excellent in an aesthetic sense, is prepared from the two-dimensionalcoordinate data thus obtained. The molecular structure diagram E₂ thusprepared can be indicated on the display 40 or can be output from theprinter 60.

The input through the keyboard 52 may be arranged to directly write the aforementioned data or the like indicating bonding states of atoms into the bond table 81 prepared in the second storage device 30. Input of bond table data may also be accepted using a device for optically reading graphics or characters, such as an image scanner or an optical card reader (OCR), as the input device of the present invention.

Next explained is the canonical data preparation method being theembodiment according to the present invention. The canonical datapreparation means described above is used for this preparation method.First, the main routine 91 a of the canonical data preparation program91 is started under control of OS 21.

As shown in the flowchart of FIG. 20, the main routine 91 a first callsthe constituent atom classification routine 91 b to assign a classnumber to each of atoms forming a compound (S910). Next, the canonicalnumber assignment routine 91 c is called to assign a canonical number toeach atom, based on the class numbers assigned to the respective atoms(S920). Further, the canonical data preparation routine 91 d is calledto prepare canonical data, based on the canonical numbers assigned tothe respective atoms (S930). The canonical data thus prepared is writteninto the compound information file 31 to be saved therein.

Next explained is the process of the constituent atom classification routine 91 b called at S910. This process classifies the atoms constituting the compound into classes of equivalent atoms and gives each atom a class number corresponding to the class to which the atom belongs. For example, since all atoms of benzene are equivalent, the same class number is given to all of them. In contrast, since the atoms of toluene are not equivalent to each other, different class numbers are given to the respective atoms.

As shown in the flowchart of FIG. 21, first, three types of attributes(a_(i), b_(ij), d_(ij)) are given to each of the atoms constituting thecompound, based on the bond table 81 (S911). Here, attribute a_(i) is akind number of an atom of input number i (which is an atomic number inthis example). Also, attribute b_(ij) is the number (vector quantity) ofbonds that are bonds adjacent to an atom of input number i and bondswith a kind number thereof (which is a type of bond in this example (1for single bond, 2 for double bond, 3 for triple bond, 4 for aromaticbond, . . . )) being j. Further, attribute d_(ij) is the number (vectorquantity) of routes that can be traced from an atom of input number ivia j bonds in the shortest path.

Next, the attributes (a_(i), b_(ij), d_(ij)) are arranged for each atomto obtain a 9-digit numeral string, class numbers C_(i) ⁰ are given tothe atoms in the ascending order of the numeral strings from thesmallest, and then the atoms are classified into a plurality of classes(S912). The class numbers C_(i) ⁰ given herein are zeroth-degree classnumbers, and first-degree class numbers C_(i) ¹, second-degree classnumbers C_(i) ², . . . are successively obtained in the loop processafter S913.

Next, the degree n is set to 1 (S913). Then attribute V_(ij) ¹ is given to each atom (S914). The attribute V_(ij) ^(n) is the number of atoms bonding to an atom of input number i and having a class number j in the degree n−1. Further, attributes (a_(i), b_(ij), d_(ij), V_(ij) ^(n)) are arranged for each atom, class numbers C_(i) ^(n) are given in the ascending order of the numeral strings from the smallest, and the atoms are classified into a plurality of classes (S915). Then it is checked whether the number N_(n) of classes is equal to N_((n−1)) or equal to the total number of atoms, and the process is terminated if either holds (S916). When neither holds, 1 is added to n and the processing returns to S914 (S917).
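The loop of S913 to S917 can be sketched in Python as below, under several assumptions. The connectivity and the 9-digit strings are hard-coded from the attribute values given later for 3,5-dimethyl-2,3,4,5-tetrahydropyridine. Class numbers are assumed to start at 1, atoms with identical keys are assumed to share a class, and keys are compared as (string, tuple) pairs rather than as one long numeral string. The names `rank` and `refine_classes` are hypothetical.

```python
# Bonded neighbours of each input number in the example compound.
NEIGHBORS = {1: [2, 6, 8], 2: [1, 3], 3: [2, 4], 4: [3, 5],
             5: [4, 6, 7], 6: [1, 5], 7: [5], 8: [1]}

# 9-digit strings (a_i, b_ij, d_ij) assembled from the values stated for the example.
BASE_KEYS = {1: "630003230", 2: "611002322", 3: "711002240", 4: "620002322",
             5: "630003230", 6: "620002420", 7: "610001223", 8: "610001223"}


def rank(keys):
    """Give class numbers 1, 2, ... in ascending key order; equal keys share a class."""
    order = {key: number for number, key in enumerate(sorted(set(keys.values())), start=1)}
    return {atom: order[key] for atom, key in keys.items()}


def refine_classes(neighbors, base_keys):
    classes = rank(base_keys)                       # S912: zeroth-degree class numbers
    n_previous = len(set(classes.values()))
    while True:
        keys = {}
        for atom, bonded in neighbors.items():      # S914: V counts neighbours per class
            v = [0] * n_previous
            for other in bonded:
                v[classes[other] - 1] += 1
            keys[atom] = (base_keys[atom], tuple(v))
        classes = rank(keys)                        # S915: reclassify with V appended
        n_current = len(set(classes.values()))
        if n_current == n_previous or n_current == len(neighbors):   # S916
            return classes
        n_previous = n_current                      # S917: next degree


if __name__ == "__main__":
    print(refine_classes(NEIGHBORS, BASE_KEYS))
```

In this sketch the number of classes grows at each degree until every atom of the example has its own class, at which point the second termination condition of S916 applies.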

Next, the process in each step of the constituent atom classification routine 91 b will be explained in detail with an example of 3,5-dimethyl-2,3,4,5-tetrahydropyridine.

First executed is the process of S911. Upon execution of this processthe data as shown in FIGS. 22A and 22B has already been written in thebond table 81 and, based on each data written in the bond table 81, thethree types of attributes (a_(i), b_(ij), d_(ij)) are given to eachatom. Here, the input numbers recorded in this bond table 81 arearbitrary numbers given in the order of handwritten input of each atom,as shown in FIG. 23.

The attribute a_(i) is gained as follows. As described previously, theattribute a_(i) is a kind number of atom of input number i. Here, anelement symbol of each atom is recorded in the bond table 81, and thekind numbers can be attained from these element symbols. Therefore, byreading an element symbol out of the bond table 81, the attribute a_(i)corresponding to this element symbol can be obtained. As a result, weobtain a₁, a₂, a₄-a₈=6, and a₃=7.

The attribute b_(ij) is obtained as follows. As discussed previously, the attribute b_(ij) is the number of bonds adjoining an atom of input number i and having a bond kind number j. The type of each bond is recorded in the bond table 81, and the attribute b_(ij) can be attained by reading this type of bond out of the bond table 81. As a result, we obtain b_(1j)=(3, 0, 0, 0), b_(2j)=(1, 1, 0, 0), b_(3j)=(1, 1, 0, 0), b_(4j)=(2, 0, 0, 0), b_(5j)=(3, 0, 0, 0), b_(6j)=(2, 0, 0, 0), b_(7j)=(1, 0, 0, 0), and b_(8j)=(1, 0, 0, 0).

Specifically, the attribute b_(ij) is obtained using the reference tableT shown in FIGS. 24A and 24B. The reference table T is formed as amatrix D(x, y) indicating the bond relation between two atoms, and isprepared based on the data of bonding atom pair and type of bond in thebond table 81. Namely, a type of bond j is written in a matrix elementindicated by each bonding atom pair, thus preparing the reference tableT.

Extraction of attribute b_(ij) using this reference table T is carried out as follows. First, matrix elements satisfying x=1 or y=1 (the matrix elements hatched in FIG. 24A) are checked among those of the reference table T to extract data (type of bond) j written in the matrix elements. As a result, we obtain D(1, 2)=1, D(1, 6)=1, and D(1, 8)=1. Since all data j of the three matrix elements thus obtained are 1, we obtain b₁₁=3. Since there is no matrix element with data j being two or more, we obtain b₁₂=b₁₃=b₁₄=0.

Next, matrix elements satisfying X=2 or Y=2 (the matrix elements hatched in FIG. 24B) are checked among those of the reference table T to extract data written in the matrix elements. As a result, we obtain D(1, 2)=1 and D(2, 3)=2. The data j of the matrix elements thus obtained are 1 and 2, each occurring once, and thus b₂₁=b₂₂=1. Since there is no matrix element with data j being 3 or more, we obtain b₂₃=b₂₄=0.

Through the same process for i=3-8, the attributes b_(ij) (i=1-8, j=1-4) shown in FIG. 25 are attained.
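The extraction of b_(ij) from the reference table T can be illustrated with a short Python sketch. The table is represented as a dictionary from bonding atom pairs to bond types. The entries D(1, 2), D(1, 6), D(1, 8), and D(2, 3) are stated in the text; the remaining four are taken to be single bonds, which is an inference from the b_(ij) values listed for the example. The names `REFERENCE_TABLE` and `bond_type_counts` are hypothetical.

```python
# Reference table T as {(x, y): type of bond}; 1 = single, 2 = double.
REFERENCE_TABLE = {
    (1, 2): 1, (1, 6): 1, (1, 8): 1, (2, 3): 2,
    (3, 4): 1, (4, 5): 1, (5, 6): 1, (5, 7): 1,   # inferred as single bonds
}


def bond_type_counts(table, atom, n_types=4):
    """Return b_ij for one atom: how many adjacent bonds have each bond type j."""
    counts = [0] * n_types
    for (x, y), bond_type in table.items():
        if atom in (x, y):                 # matrix elements with X=atom or Y=atom
            counts[bond_type - 1] += 1
    return tuple(counts)


if __name__ == "__main__":
    for i in range(1, 9):
        print(i, bond_type_counts(REFERENCE_TABLE, i))
```

Running the sketch for i=1 to 8 reproduces the eight vectors listed in the text, for instance (3, 0, 0, 0) for input number 1 and (1, 1, 0, 0) for input number 2.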

Further, the attribute d_(ij) is obtained as follows. As discussed previously, the attribute d_(ij) is the number of routes that can be traced from an atom of input number i through j bonds in the shortest path. Specifically, describing it based on the molecular structure diagram of FIG. 23, routes that can be traced from the atom of input number 1 through one bond are three in total: (input number 1 to input number 2); (input number 1 to input number 6); (input number 1 to input number 8). Routes that can be traced from the atom of input number 1 through two bonds are two in total: (input number 1 to input number 2 to input number 3); (input number 1 to input number 6 to input number 5).

Further, routes that can be traced from the atom of input number 1 through three bonds in the shortest path are three in total: (input number 1 to input number 2 to input number 3 to input number 4); (input number 1 to input number 6 to input number 5 to input number 4); (input number 1 to input number 6 to input number 5 to input number 7). Moreover, there is no route tracing from the atom of input number 1 through four bonds in the shortest path. From the results of the above processes, we obtain d_(1j)=(3, 2, 3, 0).

Through the same processes, we obtain d_(2j)=(2, 3, 2, 2), d_(3j)=(2, 2,4, 0), d_(4j)=(2, 3, 2, 2), d_(5j)=(3, 2, 3, 0), d_(6j)=(2, 4, 2, 0),d_(7j)=(1, 2, 2, 3), and d_(8j)=(1, 2, 2, 3).

Specifically, the attributes d_(ij) are obtained referring to the reference table T in the same manner as the attributes b_(ij). This extraction of attributes d_(ij) referring to the reference table T is carried out in the order of i=1, i=2, . . . . The attribute d_(1j) (i=1) is first extracted.

The extraction of attribute d_(1j) (i=1) is to check matrix elements satisfying X=1 or Y=1 (the matrix elements hatched in FIG. 26A) among those of the reference table T and to extract each matrix element in which data is written. Then, 1 is written as a bond path number in each matrix element extracted. As a result, the bond path number 1 is written in D(1, 2), D(1, 6), and D(1, 8) (each bond path number is shown as enclosed in a triangle in FIG. 26A).

Next extracted are suffixes S=(1, 2), (1, 6), (1, 8) of the matrix elements each having the bond path number 1 written. From these suffixes S, 1, which has been used in the previous extraction process, is excluded, thus obtaining S=2, 6, 8. Based on S=2, 6, 8 thus obtained, matrix elements satisfying X=2, 6, 8 or Y=2, 6, 8 (the matrix elements hatched in FIG. 26B) are checked to extract each matrix element with data written therein and with no bond path number written yet. Then, 2 is written as a bond path number in each matrix element extracted. As a result, the bond path number 2 is written in D(2, 3) and D(5, 6).

Further, extracted are suffixes S=(2, 3), (5, 6) of the matrix elements with the bond path number 2 written therein. From these suffixes S, 2 and 6, having already been used in the previous extraction process, are excluded, thus obtaining S=3, 5. Based on S=3, 5 thus obtained, matrix elements satisfying X=3, 5 or Y=3, 5 (the matrix elements hatched in FIG. 27) are checked to extract each matrix element with data written therein and with no bond path number written yet. Then, 3 is written as a bond path number in each matrix element extracted. As a result, the bond path number 3 is written in D(3, 4), D(4, 5), and D(5, 7).

Through the above processes, the bond path numbers are written in all the matrix elements. As a result, there are three matrix elements with the bond path number 1, two matrix elements with the bond path number 2, three matrix elements with the bond path number 3, and no matrix element with the bond path number 4, thus attaining d_(1j)=(3, 2, 3, 0).

Next, the attribute d_(2j) (i=2) is extracted. The extraction of attribute d_(2j) (i=2) is to check matrix elements satisfying X=2 or Y=2 (the matrix elements hatched in FIG. 28A) among those of the reference table T and to extract each matrix element with data written therein. Then, 1 is written as a bond path number in each matrix element extracted. As a result, the bond path number 1 is written in D(1, 2) and D(2, 3) (each bond path number is shown as enclosed in a triangle in FIG. 28A).

Next extracted are suffixes S=(1, 2), (2, 3) of matrix elements eachwith the bond path number 1 written therein. Excluding 2, having alreadybeen used in the previous extraction process, from these suffixes S, weobtain S=1, 3. Based on S=1, 3 thus obtained, matrix elements satisfyingX=1, 3 or Y=1, 3 (the matrix elements hatched in FIG. 28B) are checkedto extract a matrix element with data written therein and with no bondpath number written yet. Then, 2 is written as a bond path number ineach matrix element extracted. As a result, the bond path number 2 iswritten in D(1, 6), D(1, 8), and D(3, 4).

Further, extracted are suffixes S=(1, 6), (1, 8), (3, 4) of the matrixelements with the bond path number 2 written therein. Excluding 1, 3,having already been used in the previous extraction process, from thesesuffixes S, we obtain S=4, 6, 8. Based on S=4, 6, 8 thus obtained,matrix elements satisfying X=4, 6, 8 or Y=4, 6, 8 (the matrix elementshatched in FIG. 29A) are checked to extract a matrix element with datawritten therein and with no bond path number written yet. Then, 3 iswritten as a bond path number in each matrix element extracted. As aresult, the bond path number 3 is written in D(4, 5) and D(5, 6).

Furthermore, extracted are suffixes S=(4, 5), (5, 6) of the matrixelements with the bond path number 3 written therein. Excluding 4, 6,having already been used in the previous extraction process, from thesesuffixes S, we obtain S=5, 5 (which means that S=5 is doubly applied).Based on S=5, 5 thus obtained, matrix elements satisfying X=5 or Y=5(the matrix elements hatched in FIG. 29B) are checked to extract amatrix element with data written therein and with no bond path numberwritten therein yet. Then, 4 is written as a bond path number in eachmatrix element extracted. As a result, two of the bond path number 4 arewritten in D(5, 7).

Through the above processes, the bond path numbers are written in all the matrix elements. As a result, there are two matrix elements with the bond path number 1, three matrix elements with the bond path number 2, two matrix elements with the bond path number 3, and two matrix elements with the bond path number 4, thus attaining d_(2j)=(2, 3, 2, 2).

By the same processes for i=3 to 8, d_(ij) (i=1 to 8, j=1 to 4) shown inFIG. 25 are attained. The process of S911 as described above gave thethree types of attributes (a_(i), b_(ij), d_(ij)) to each of the atomsconstituting 3,5-dimethyl-2,3,4,5-tetrahydropyridine.
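The bond path extraction walked through above for i=1 and i=2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the routine of the embodiment: the function name `bond_path_counts` and the list `BONDS` (the bonded atom pairs read off the matrix elements D(X, Y) named in the text) are hypothetical, and counting a bond once for every occurrence of a frontier atom is our reading of "S=5 is doubly applied".

```python
# Bonded atom pairs (input numbers) named in the text as D(X, Y).
BONDS = [(1, 2), (1, 6), (1, 8), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7)]


def bond_path_counts(bonds, start, max_depth=4):
    """Return (d_i1, ..., d_i4): how many bonds receive bond path number j.

    Assumption: a bond reached from a frontier atom that occurs twice
    is counted twice, as in the i=2 walk-through (S=5, 5).
    """
    labeled = set()       # bonds that already carry a bond path number
    frontier = [start]    # suffixes S to check at the current level
    counts = []
    for _ in range(max_depth):
        hits = []         # (bond, far atom) for each frontier occurrence
        for s in frontier:
            for a, b in bonds:
                if s in (a, b) and (a, b) not in labeled:
                    hits.append(((a, b), b if s == a else a))
        counts.append(len(hits))
        labeled.update(bond for bond, _ in hits)
        # The far ends of the newly numbered bonds form the next S.
        frontier = [far for _, far in hits]
    return tuple(counts)


d1 = bond_path_counts(BONDS, 1)
d2 = bond_path_counts(BONDS, 2)
```

With the bond list above, `d1` and `d2` reproduce the values derived in the text for the atoms of input numbers 1 and 2.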

Next executed is the process of S912. As described above, at S912 theattributes (a_(i), b_(ij), d_(ij)) for each atom are arranged in a9-digit numeral string, and class numbers C_(i) ⁰ are given to the atomsin the ascending order of the numeral strings from the smallest, thusclassifying the atoms into a plurality of classes. The class numbersC_(i) ⁰ given herein are zeroth-degree class numbers.

Describing the process of S912 specifically, the numeral string of the atom of input number 1 is “630003230” and the numeral string of the atom of input number 2 is “611002322”. Following it in order, we obtain “711002240”, “620002322”, “630003230”, “620002420”, “610001223”, and “610001223”.

As a result, the numeral strings of the atoms of input numbers 7 and 8 are minimum, so that the class number C₇ ⁰=C₈ ⁰=1 is given to these atoms. Similarly, the class number C₂ ⁰=2 is given to the atom of input number 2, and the class number C₄ ⁰=3 to the atom of input number 4. Also, the class number C₆ ⁰=4 is given to the atom of input number 6, and the class number C₁ ⁰=C₅ ⁰=5 to the atoms of input numbers 1 and 5. Further, the class number C₃ ⁰=6 is given to the atom of input number 3 (see FIG. 30A). The atoms are classified into the six classes in this manner, and thus the number N₀ of classes is 6.
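The zeroth-degree classification of S912 can be sketched as follows. This is only an illustration: the names `numeral_string`, `assign_class_numbers`, and `ATTRIBUTES` are hypothetical, and the split of each 9-digit string into one digit for a_(i), four digits for b_(ij), and four digits for d_(ij) is an assumption inferred from the strings quoted above.

```python
def numeral_string(a, b, d):
    """Concatenate the attributes (a_i, b_ij, d_ij) into one numeral string."""
    return str(a) + "".join(map(str, b)) + "".join(map(str, d))


def assign_class_numbers(strings):
    """Give class numbers 1, 2, ... in ascending order of the strings.

    Atoms with identical strings share one class number.
    """
    rank = {s: n for n, s in enumerate(sorted(set(strings.values())), start=1)}
    return {atom: rank[s] for atom, s in strings.items()}


# Attributes per input number, split from the strings quoted in the text
# (the 1 + 4 + 4 digit split is an assumption).
ATTRIBUTES = {
    1: (6, (3, 0, 0, 0), (3, 2, 3, 0)),
    2: (6, (1, 1, 0, 0), (2, 3, 2, 2)),
    3: (7, (1, 1, 0, 0), (2, 2, 4, 0)),
    4: (6, (2, 0, 0, 0), (2, 3, 2, 2)),
    5: (6, (3, 0, 0, 0), (3, 2, 3, 0)),
    6: (6, (2, 0, 0, 0), (2, 4, 2, 0)),
    7: (6, (1, 0, 0, 0), (1, 2, 2, 3)),
    8: (6, (1, 0, 0, 0), (1, 2, 2, 3)),
}

STRINGS = {atom: numeral_string(*attrs) for atom, attrs in ATTRIBUTES.items()}
CLASS0 = assign_class_numbers(STRINGS)
N0 = len(set(CLASS0.values()))
```

Because all strings have the same length, sorting them as text gives the same order as sorting them as numbers.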

Next, the process of S913 is carried out to set the degree n to 1.

Further, the process of S914 is carried out. As described previously, the attribute V_(ij) ¹ (n=1) is given to each atom at S914. Here, the attribute V_(ij) ^(n) is the number of atoms bonding to an atom of input number i and having a class number of j. Namely, describing it based on the molecular structure diagram of FIG. 30B, input numbers of atoms bonding to the atom of input number 1 are 2, 6, 8, and the class numbers of these atoms are C₂ ⁰=2, C₆ ⁰=4, and C₈ ⁰=1. As a result, 1 is written in the attribute V_(1j) ¹ of j=1, 2, 4, thus obtaining V_(1j) ¹=(1, 1, 0, 1, 0, 0).

Also, input numbers of atoms bonding to the atom of input number 2 are 1, 3, and the class numbers of these atoms are C₁ ⁰=5 and C₃ ⁰=6. As a result, 1 is written in the attribute V_(2j) ¹ of j=5, 6, thus obtaining V_(2j) ¹=(0, 0, 0, 0, 1, 1). The same processes for the atoms of input numbers 3 to 8 will result in obtaining V_(3j) ¹=(0, 1, 1, 0, 0, 0), V_(4j) ¹=(0, 0, 0, 1, 1, 0), V_(5j) ¹=(1, 0, 1, 1, 0, 0), V_(6j) ¹=(0, 0, 0, 0, 2, 0), V_(7j) ¹=(0, 0, 0, 0, 1, 0), and V_(8j) ¹=(0, 0, 0, 0, 1, 0).

Specifically, the attributes V_(ij) ¹ are obtained using the reference table T shown in FIGS. 24A and 24B. Extraction of attributes V_(ij) ¹ using this reference table T is carried out in the order of i=1, i=2, . . . . First, attribute V_(1j) ¹ (i=1) is extracted. Extraction of attribute V_(1j) ¹ (i=1) is to check the matrix elements satisfying X=1 or Y=1 (the matrix elements hatched in FIG. 24A) among the matrix elements of the reference table T and to extract suffixes S=(1, 2), (1, 6), (1, 8) of the matrix elements with data written therein. Excluding i=1 from these suffixes S, we obtain S=2, 6, 8. Substituting the values of S thus obtained into the class number C_(i) ⁰, we obtain C₂ ⁰=2, C₆ ⁰=4, and C₈ ⁰=1. Then, 1 is written in the attribute V_(1j) ¹ of j=1, 2, 4, thus obtaining V_(1j) ¹=(1, 1, 0, 1, 0, 0).

Next, the attribute V_(2j) ¹ (i=2) is extracted. The extraction ofattribute V_(2j) ¹ (i=2) is to check the matrix elements satisfying X=2or Y=2 (the matrix elements hatched in FIG. 24B) among the matrixelements of the reference table T and to extract suffixes S=(1, 2), (2,3) of the matrix elements with data written therein. Excluding i=2 fromthese suffixes S, we obtain S=1, 3. The values of S thus obtained aresubstituted into the class number C_(i) ⁰, thus obtaining C₁ ⁰=5 and C₃⁰=6. Then 1 is written in the attribute V_(2j) ¹ of j=5, 6, thusattaining V_(2j) ¹=(0, 0, 0, 0, 1, 1).

The same processes for i=3 to 8 will result in obtaining the attributesV_(ij) ¹ (i=1 to 8, j=1 to 6) shown in FIG. 31.
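The extraction of V_(ij) ¹ from the bond pairs can be sketched as follows. The function name `neighbor_class_counts` and the constants `BONDS` and `CLASS0` are hypothetical; `BONDS` lists the bonded atom pairs named in the text and `CLASS0` holds the zeroth-degree class numbers given above.

```python
# Bonded atom pairs (input numbers) and zeroth-degree class numbers
# as given in the text.
BONDS = [(1, 2), (1, 6), (1, 8), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7)]
CLASS0 = {1: 5, 2: 2, 3: 6, 4: 3, 5: 5, 6: 4, 7: 1, 8: 1}


def neighbor_class_counts(bonds, classes, atom):
    """Return V_ij: for each class j, the number of atoms bonded to `atom`
    whose class number is j."""
    counts = [0] * max(classes.values())
    for a, b in bonds:
        if atom in (a, b):
            other = b if atom == a else a      # exclude i itself from the suffix
            counts[classes[other] - 1] += 1    # class numbers start at 1
    return tuple(counts)


v1 = neighbor_class_counts(BONDS, CLASS0, 1)
v2 = neighbor_class_counts(BONDS, CLASS0, 2)
```

Here `v1` and `v2` correspond to V_(1j) ¹ and V_(2j) ¹ as derived step by step in the text.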

Next executed is the process of S915. As described previously, at S915the attributes (C_(i) ^(n−1), V_(ij) ^(n)) are arranged for each atom,and class numbers C_(i) ^(n) are given to the atoms in the ascendingorder of the numeral strings from the smallest, thus classifying theatoms into a plurality of classes.

Specifically, the numeral string of the atom of input number 1 is“5110100” and the numeral string of the atom of input number 2 is“2000011”. Following it in order, we obtain “6011000”, “3000110”,“5101100”, “4000020”, “1000010”, and “1000010”.

As a result, the numeral strings of the atoms of input numbers 7 and 8are minimum, and the class number C₇ ¹=C₈ ¹=1 is given to these atoms.Similarly, the class number C₂ ¹=2 is given to the atom of input number2, and the class number C₄ ¹=3 to the atom of input number 4. Further,the class number C₆ ¹=4 is given to the atom of input number 6, and theclass number C₅ ¹=5 to the atom of input number 5. Furthermore, theclass number C₁ ¹=6 is given to the atom of input number 1, and theclass number C₃ ¹=7 to the atom of input number 3. The atoms areclassified into the seven classes in this manner, and the number N₁ ofclasses is 7.

The process of S916 is next executed to check whether the number N_(n)of classes is equal to N_((n−1)), and the process is terminated ifequal. Also, whether the number N_(n) of classes is equal to the totalatom number is checked, and the process is terminated if equal. Here,since the number N₁ of classes is 7 and the number N₀ of classes is 6,N₁ is not equal to N₀. Also, since the total number of atoms is 8, thenumber N₁ of classes is not equal to the total number of atoms. Sinceneither is equal in this way, the process of S917 is executed to set nto 2.

Further, the process returns to S914 to give the attribute V_(ij) ² toeach atom. As a result, as shown in FIG. 32, we obtain V_(1j) ²=(1, 1,0, 1, 0, 0, 0), V_(2j) ²=(0, 0, 0, 0, 0, 1, 1), V_(3j) ²=(0, 1, 1, 0, 0,0, 0), V_(4j) ²=(0, 0, 0, 0, 1, 0, 1), V_(5j) ²=(1, 0, 1, 1, 0, 0, 0),V_(6j) ²=(0, 0, 0, 0, 1, 1, 0), V_(7j) ²=(0, 0, 0, 0, 1, 0, 0), andV_(8j) ²=(0, 0, 0, 0, 0, 1, 0).

Then the process of S915 is carried out to give the class number C_(i) ²to each atom. As a result, as shown in FIG. 30C, we obtain C₁ ²=7, C₂²=3, C₃ ²=8, C₄ ²=4, C₅ ²=6, C₆ ²=5, C₇ ²=2, and C₈ ²=1. The atoms areclassified into the eight classes in this manner, and the number N₂ ofclasses is 8. Since the number of classes N₂=8 is equal to the totalnumber of atoms, the process is terminated by determination at S916.
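The loop of S913 to S917 can be sketched as below. This is an illustration under assumptions: `refine_classes` is a hypothetical name, and the sketch compares Python tuples (C_(i) ^(n−1), V_(ij) ^(n)) instead of concatenated numeral strings, which gives the same ordering for the single-digit values in this example.

```python
BONDS = [(1, 2), (1, 6), (1, 8), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7)]
CLASS0 = {1: 5, 2: 2, 3: 6, 4: 3, 5: 5, 6: 4, 7: 1, 8: 1}


def refine_classes(bonds, classes):
    """Repeat S914/S915 until the number of classes stops growing or
    every atom has its own class (the two checks of S916)."""
    atoms = sorted(classes)
    n_prev = len(set(classes.values()))
    while True:
        width = max(classes.values())
        keys = {}
        for i in atoms:
            v = [0] * width                      # V_ij^n for atom i
            for a, b in bonds:
                if i in (a, b):
                    v[classes[b if i == a else a] - 1] += 1
            keys[i] = (classes[i], tuple(v))     # (C_i^(n-1), V_ij^n)
        # S915: class numbers in ascending order of the keys.
        rank = {k: n for n, k in enumerate(sorted(set(keys.values())), start=1)}
        classes = {i: rank[keys[i]] for i in atoms}
        n_now = len(rank)
        if n_now == n_prev or n_now == len(atoms):
            return classes
        n_prev = n_now


FINAL_CLASS = refine_classes(BONDS, CLASS0)
```

For the example molecule the loop runs twice (N₁=7, then N₂=8) and stops because the number of classes equals the number of atoms.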

Next explained using the flowchart of FIG. 33 is the process ofcanonical number assignment routine 91 c called at S920 of FIG. 20.Here, a canonical number is a number of each atom uniquely determineddepending upon the structure of a compound. Namely, an input numbergiven by handwritten input of molecular structure diagram is anarbitrary number changing depending upon change of input order. Incontrast with it, the canonical data 82 is unique data depending only onthe structure of compound. Therefore, it is difficult to directly makethe unique canonical data 82 from the arbitrary input numbers. Thus, thecanonical data preparation program 91 enables smooth preparation ofcanonical data 82 by converting the input numbers once into canonicalnumbers and preparing the canonical data 82 based on the uniquecanonical numbers.

In the process of canonical number assignment routine 91 c, first, 1 is given to variable k (S921). Next, the final class numbers C_(i) ^(f) obtained in the constituent atom classification routine 91 b are checked, and a canonical number k (k=1 herein) is given to the atom with the maximum class number (S922). If there are a plurality of atoms with the maximum class number, an arbitrary atom is selected out of these atoms, and the canonical number k is given to this atom. Once canonical numbers have been assigned to all atoms, the process is terminated (S923).

Next, 1 is added to the variable k (S924), and, out of the atoms for each of which the canonical number is decided (which will be referred to as decided atoms), a decided atom to which an atom for which a canonical number is not decided (which will be referred to as an undecided atom) bonds is extracted (S925). Then whether there are plural decided atoms extracted is determined (S926), and if there are plural decided atoms extracted, a decided atom with the minimum canonical number is selected out of these decided atoms (S927). Then an undecided atom with the maximum class number C_(i) ^(f) is extracted out of the undecided atoms bonding to the decided atom thus selected, and the canonical number of this undecided atom is determined as k (S928). If there are plural undecided atoms with the maximum class number C_(i) ^(f), an arbitrary one is selected out of these undecided atoms.

When one decided atom is determined at S926, an undecided atom with the maximum class number C_(i) ^(f) is selected out of the undecided atoms bonding to this decided atom and is given the canonical number k (S929). After completion of the processes of S928 and S929 the processing returns to S923, and the loop of S923 to S929 is repeated until the canonical numbers are assigned to all the atoms.

Next, the process of canonical number assignment routine 91 c isexplained with a specific example using3,5-dimethyl-2,3,4,5-tetrahydropyridine. First, 1 is given to thevariable k in the process of S921 and then the process of S922 iscarried out. In the process of S922, since the atom of input number 3has maximum C₃ ^(f)=8, the canonical number k=1 is given to the atom ofinput number 3. Next, the process of S924 is executed to change thevariable k to 2, and the process of S925 is then carried out to extractthe atom of input number 3 as a decided atom.

Since there is one decided atom thus extracted, the process of S929 isthen carried out. Since undecided atoms bonding to the atom of inputnumber 3 are the atoms of input numbers 2, 4, an atom with the maximumclass number C_(i) ^(f) is selected out of these atoms. Namely, theclass number of the atom of input number 2 is C₂ ^(f)=3, and the classnumber of the atom of input number 4 is C₄ ^(f)=4. Thus, the atom ofinput number 4 is selected, and the canonical number k=2 is given tothis atom.

Next, the flow returns to the process of S924 to change the variable kto 3, and the process of S925 is carried out to extract the atoms ofinput numbers 3, 4 as decided atoms. Since there are plural decidedatoms thus extracted, then the process of S927 is carried out to selectan atom with the minimum canonical number out of the decided atoms thusextracted. Namely, the canonical number of the atom of input number 3 is1 and the canonical number of the atom of input number 4 is 2. Thus, theatom of input number 3 is selected. Then the process of S928 is carriedout to give the canonical number k=3 to the atom of input number 2bonding to the atom of input number 3.

Further, the flow returns to the process of S924 to change the variablek to 4, and the process of S925 is carried out to extract the atoms ofinput numbers 2, 4 as decided atoms. Since there are plural decidedatoms thus extracted, then the process of S927 is carried out to selectan atom with the minimum canonical number out of the decided atoms thusextracted. Namely, the canonical number of the atom of input number 2 is3 and the canonical number of the atom of input number 4 is 2. Thus, theatom of input number 4 is selected. Then the process of S928 is carriedout to give the canonical number k=4 to the atom of input number 5bonding to the atom of input number 4.

Repeating the same processes, the canonical number 5 is assigned to theatom of input number 1 and the canonical number 6 to the atom of inputnumber 6, respectively. Also, the canonical number 7 is given to theatom of input number 7 and the canonical number 8 to the atom of inputnumber 8, respectively.

After that, the process of S923 is carried out, and because the canonical numbers are obtained for all the atoms at this stage, the process is terminated. As a result, the canonical numbers as shown in FIG. 34 are obtained.
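The assignment of S921 to S929 can be sketched as follows. The name `assign_canonical_numbers` is hypothetical, the sketch assumes a connected molecule, and ties (which the text resolves arbitrarily) are resolved here by taking the first candidate found.

```python
BONDS = [(1, 2), (1, 6), (1, 8), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7)]
# Final class numbers C_i^f from the classification example.
FINAL_CLASS = {1: 7, 2: 3, 3: 8, 4: 4, 5: 6, 6: 5, 7: 2, 8: 1}


def assign_canonical_numbers(bonds, final_class):
    """Map input numbers to canonical numbers (assumes a connected graph)."""
    neighbors = {atom: [] for atom in final_class}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # S921/S922: canonical number 1 goes to the atom with the maximum class.
    first = max(final_class, key=final_class.get)
    canon = {first: 1}
    k = 1
    while len(canon) < len(final_class):          # S923
        k += 1                                    # S924
        # S925: decided atoms that still have an undecided neighbour.
        candidates = [a for a in canon
                      if any(n not in canon for n in neighbors[a])]
        # S926/S927: pick the decided atom with the minimum canonical number.
        anchor = min(candidates, key=canon.get)
        # S928/S929: its undecided neighbour with the maximum class gets k.
        undecided = [n for n in neighbors[anchor] if n not in canon]
        canon[max(undecided, key=final_class.get)] = k
    return canon


CANON = assign_canonical_numbers(BONDS, FINAL_CLASS)
```

For the example, `CANON` maps input number 3 to canonical number 1, input number 4 to 2, input number 2 to 3, and so on, in the order traced in the text.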

Next explained using the flowchart of FIG. 35 is the process of canonical data preparation routine 91 d called at S930. In this process, first, the input numbers are replaced by the canonical numbers, as shown in FIGS. 36A and 36B, to rewrite the bond table 81 (S931). Then, based on this bond table 81, three types of data (P_(i), T_(i), S_(i)) are obtained for each atom (S932). Here, P_(i) is the minimum canonical number among the atoms bonding to the atom of canonical number i (i>1). Also, T_(i) is a symbol of the type of bond between the atom of canonical number i (i>1) and the atom of canonical number P_(i) (— for single bond, ═ for double bond, # for triple bond, % for aromatic bond, and so on in this example). Further, S_(i) is a symbol for the type of the atom of canonical number i (i>0) (which is an element number in this case).

Specifically, first, an element number of the atom of canonical number 1 is checked with reference to the atomic table 81 g. This will result in obtaining S₁=“N”. Next, which atom bonds to the atom of canonical number 2 is checked referring to the atomic pair table 81 h. As a result, the atoms of canonical numbers 1, 4 are obtained. Since the minimum canonical number is 1 out of these atoms, P₂=1. Since the bond between the atom of canonical number 2 and the atom of canonical number 1 is a single bond, T₂=“—”. Further, S₂=“C” is obtained referring to the atomic table 81 g.

Next, which atom bonds to the atom of canonical number 3 is checked referring to the atomic pair table 81 h. As a result, the atoms of canonical numbers 1, 5 are attained. Since the minimum canonical number is 1 among these atoms, P₃=1. Since the bond between the atom of canonical number 3 and the atom of canonical number 1 is a double bond, T₃=“=”. Further, referring to the atomic table 81 g, S₃=“C” is obtained. The same processes to follow obtain P₄=2, P₅=3, P₆=4, P₇=4, P₈=5, T₄ to T₈=“—”, and S₄ to S₈=“C”.

Next extracted is a bonding atom pair which was not referred to upon obtaining T_(i) in the process of S932 (S933). This process is carried out referring to the atomic pair table 81 h. This will result in extracting a bonding atom pair of the atom of canonical number 5 and the atom of canonical number 6. Then three types of data (R¹ _(j), R² _(j), H_(j)) are obtained for the bonding atom pair thus extracted (S934). Here, R¹ _(j), R² _(j) are canonical numbers of two atoms constituting the bond. Also, H_(j) is a symbol for a type of the bond (the same symbols as T_(i) are used in this example). It is assumed that R¹ _(j) and R² _(j) satisfy the relation of R¹ _(j)<R² _(j). With another bonding atom pair (R¹ _(k), R² _(k)) listed after it (j<k), they are supposed to satisfy the relation of R¹ _(j)<R¹ _(k), or the relation of R¹ _(j)=R¹ _(k) and R² _(j)<R² _(k).

The above processes prepared the canonical tree structure data shown inFIG. 37.

Next, the data obtained in the processes of S932 and S934 is aligned inline, thus preparing canonical data (S935). Namely, defining a delimiterF different from the symbols for the types of atom and for the types ofbond, the data obtained in the processes of S932 and S934 is aligned asfollows.

S₁, P₂, T₂, S₂, P₃, T₃, S₃, P₄, T₄, S₄, . . . , P_(N), T_(N), S_(N), F,R¹ ₁, H₁, R² ₁, F, R¹ ₂, H₂, R² ₂, . . . , F, R¹ _(M), H_(M), R² _(M), F

Here, N is the total number of atoms and M is the total number ofbonding atom pairs extracted at S934.

The data string thus obtained is canonical data uniquely correspondingto the structure of compound. Specifically, using “/” as the delimiterF, the obtained data is aligned in the predetermined order as follows.

“N1-C1=C2-C3-C4-C4-C5-C/5-6/”

Then this canonical data is written into the compound information file31 to be saved therein (S936). After that, the process is terminated.
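The alignment of S932 to S935 can be sketched as follows. The names `canonical_string`, `SYMBOLS`, and `BOND_TYPES` are hypothetical; the bond table is assumed to be already rewritten with canonical numbers (S931), "-" and "=" stand in for the single and double bond symbols, and each leftover bonding atom pair is written with its smaller canonical number first, as in the "5-6" entry of the example.

```python
# Atom symbols and bond types keyed by canonical numbers, taken from the
# worked example (T2 single, T3 double, all other bonds single).
SYMBOLS = {1: "N", 2: "C", 3: "C", 4: "C", 5: "C", 6: "C", 7: "C", 8: "C"}
BOND_TYPES = {(1, 2): "-", (1, 3): "=", (2, 4): "-", (3, 5): "-",
              (4, 6): "-", (4, 7): "-", (5, 8): "-", (5, 6): "-"}


def canonical_string(symbols, bond_types, delimiter="/"):
    """Build S1, P2, T2, S2, ..., PN, TN, SN, F, R1, H, R2, F, ..."""
    bond_type = {frozenset(pair): t for pair, t in bond_types.items()}
    neighbors = {i: [] for i in symbols}
    for a, b in bond_types:
        neighbors[a].append(b)
        neighbors[b].append(a)

    parts = [symbols[1]]
    used = set()                      # bonds consumed as (P_i, T_i)
    for i in range(2, len(symbols) + 1):
        p = min(neighbors[i])         # P_i: smallest bonded canonical number
        bond = frozenset((i, p))
        used.add(bond)
        parts.append(f"{p}{bond_type[bond]}{symbols[i]}")

    # S933/S934: bonds not referred to above, in ascending order.
    leftover = sorted(tuple(sorted(b)) for b in bond_type if b not in used)
    tail = "".join(f"{r1}{bond_type[frozenset((r1, r2))]}{r2}{delimiter}"
                   for r1, r2 in leftover)
    return "".join(parts) + delimiter + tail


CANONICAL_DATA = canonical_string(SYMBOLS, BOND_TYPES)
```

Note that concatenating numbers without separators is only unambiguous while canonical numbers are single digits; how multi-digit numbers are delimited is not covered by this sketch.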

The canonical data preparation means and method according to the presentinvention are not limited to the above embodiment, but may be modifiedwithin the scope not departing from the spirit of the present invention,for example as follows.

(1) The above embodiment used the data string including the symbols S_(i) for the types of atom as the canonical data, but the symbol for the type of the atom with the highest frequency of occurrence (which is normally C for carbon) may be excluded from the data string. Namely, omitting the symbol for carbon C out of the above canonical data, we obtain the following.

“N1-1=2-3-4-4-5-/5-6/”

Shortening the data string in this manner can reduce the quantity ofdata written into the compound information file 31.
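A minimal sketch of this shortening is given below. The function name `shortened_canonical_string` and the lists `ENTRIES` and `CLOSURES` are hypothetical; they hold the (P_(i), T_(i), S_(i)) and (R¹ _(j), H_(j), R² _(j)) values of the worked example, with "-" and "=" standing in for the bond symbols.

```python
# (P_i, T_i, S_i) for canonical numbers 2 to 8, and the leftover bond 5-6.
ENTRIES = [(1, "-", "C"), (1, "=", "C"), (2, "-", "C"), (3, "-", "C"),
           (4, "-", "C"), (4, "-", "C"), (5, "-", "C")]
CLOSURES = [(5, "-", 6)]


def shortened_canonical_string(first_symbol, entries, closures,
                               omit="C", delimiter="/"):
    """Align the data as before, but leave out the symbol `omit`."""
    body = "" if first_symbol == omit else first_symbol
    for p, t, s in entries:
        body += f"{p}{t}" + ("" if s == omit else s)
    tail = "".join(f"{r1}{h}{r2}{delimiter}" for r1, h, r2 in closures)
    return body + delimiter + tail


SHORT = shortened_canonical_string("N", ENTRIES, CLOSURES)
```

Passing `omit=None` keeps every atom symbol, which gives the unshortened form.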

(2) The following processes may be added to the canonical numberassignment routine 91 c in the case of a plurality of undecided atomswith the maximum class number C_(i) ^(f) being selected in the processof S929.

(a) If an undecided atom with the maximum class number C_(i) ^(f) doesnot belong to a cyclic structure portion, an arbitrary undecided atom isselected out of the plurality of undecided atoms and k is assigned as acanonical number of this undecided atom. After that, the processingreturns to S923.

(b) If an undecided atom with the maximum class number C_(i) ^(f)belongs to a cyclic structure portion, as to a structure obtained bycutting bonds between the undecided atoms selected at S929 (hereinafterreferred to as candidate atoms) and decided atoms bonding to thesecandidate atoms, the following vector quantity is defined for eachcandidate atom.

m_(ik): the minimum bond number between candidate atom i and atom withcanonical number k.

The order of priority is preliminarily determined as to this attribute,and an atom i with the highest priority order is selected and k isassigned as a canonical number of the atom. After that, the processreturns to S923.

Here, criteria of judgment of priority order in attribute values of atoms are as follows. First, non-vector quantities depend upon the degree of priority order. As for vector quantities, when elements of two vectors i, k are attributes V_(ij), V_(kj), the magnitude at minimum j among the elements with V_(ij)≠V_(kj) is employed as a criterion of judgment of priority order. By employing such criteria of judgment, priority orders of the attributes b_(ij), d_(ij), V_(ij) ^(n), m_(ij) can be determined. In the case of priority orders being determined by a plurality of attributes, priority orders are preliminarily determined among the attributes, and priority is given to judgment in an attribute with a higher priority order.
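The vector criterion above amounts to comparing two vectors at the first position where they differ. A minimal sketch follows; the name `first_difference` is hypothetical, and whether the larger or the smaller value ranks higher is left open here because the text leaves it to the predetermined priority order.

```python
def first_difference(v_i, v_k):
    """Return (j, V_ij, V_kj) for the minimum j with V_ij != V_kj,
    or None when the two vectors are identical (j is counted from 1)."""
    for j, (x, y) in enumerate(zip(v_i, v_k), start=1):
        if x != y:
            return j, x, y
    return None


# V_1j^1 and V_5j^1 from the classification example.
RESULT = first_difference((1, 1, 0, 1, 0, 0), (1, 0, 1, 1, 0, 0))
```

In this example the two vectors first differ at j=2, so only the values at that position decide the order.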

The above canonical data preparation method according to the present invention was used to obtain the canonical data of the C₆₀ molecule shown in FIG. 38A, and the canonical data (FIG. 38B) for uniquely specifying the structure of the C₆₀ molecule was obtained in just 1.5 seconds. In contrast, when the canonical data of the C₆₀ molecule was obtained using an information processing apparatus of the same performance by the Morgan algorithm, without the process for classifying the atoms into equivalent atoms, 550 seconds were needed to obtain the canonical data. Therefore, if the above canonical data preparation means and method are employed in the present invention, the speed of the biochemical information processing according to the present invention can be improved remarkably.

The foregoing explained the preferred embodiment of the biochemicalinformation processing apparatus and method of the present invention,but it should be understood that the present invention is not limited tothe above embodiment.

For example, the canonical data preparation means (the canonical datapreparation program 91) according to the present invention does not haveto be incorporated together with the other means (the reaction schemedetection program 25 etc.) in the first storage device in thebiochemical information processing apparatus of the present invention,but, as shown in FIG. 39 and FIG. 40, the canonical data preparationmeans (the canonical data preparation program 91) according to thepresent invention and the other means (the reaction scheme detectionprogram 25 etc.) may exist separately from each other in the firststorage device 20.

Also, the biochemical information processing apparatus of the presentinvention does not have to comprise all of the reaction scheme detectionmeans (the reaction scheme detection program) 25, the receptorinformation detection means (the receptor information detection program)26, and the reaction path detection means (the reaction path detectionprogram) 27, but the apparatus may be arranged, for example, to beprovided with the reaction scheme detection means (the reaction schemedetection program) 25 and the reaction path detection means (thereaction path detection program) 27, as shown in FIG. 41, or to beprovided with only either one of them. In this case, the receptorinformation file 36 is not necessary, and the mutual relation betweenthe compound numbers C₁-C₇ and the enzyme numbers E₁-E₆ described in thereaction path diagram of FIG. 2 is recorded in the relation informationfile 33 shown in FIG. 42. Describing in more detail, the enzyme numbersE₁-E₆ of the enzymes with each compound of compound number C₁-C₆ being asubstrate, the enzyme numbers E₁-E₆ of the enzymes with each compound ofcompound number C₂-C₇ being a product, and the enzyme number E₄ of theenzyme inhibited by the compound of compound number C₆ are recorded inthe form of a list corresponding to the compound numbers C₁-C₇.Therefore, when access is made to the relation information file 33 usingthe compound number C₁-C₇ as a key, the apparatus can read out theenzyme numbers E₁-E₆ of the enzymes with each compound of compoundnumber C₁-C₇ being a substrate or a product, and the enzyme number E₄ ofthe enzyme inhibited by the compound of compound number C₆. The mainprogram 23 in this case is the same as FIG. 11 except that it excludesstep S115 for calling the receptor information indication program, asshown in FIG. 43.
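A keyed lookup of this kind can be sketched as follows. Everything in the sketch is an assumption made for illustration: the dictionary `RELATION_FILE`, the function `read_relations`, and in particular the linear chain in which enzyme E_n converts compound C_n into C_(n+1). The actual correspondence is the one recorded in FIG. 42, which is not reproduced here; only the inhibition of E₄ by C₆ is taken from the text.

```python
# Hypothetical contents: a linear chain C1 -E1-> C2 ... C6 -E6-> C7.
RELATION_FILE = {}
for n in range(1, 8):
    RELATION_FILE[f"C{n}"] = {
        "substrate_of": [f"E{n}"] if n <= 6 else [],    # C1-C6 are substrates
        "product_of": [f"E{n - 1}"] if n >= 2 else [],  # C2-C7 are products
        "inhibits": ["E4"] if n == 6 else [],           # C6 inhibits E4 (from the text)
    }


def read_relations(relation_file, compound_number):
    """Read the enzyme numbers listed for one compound number (the key)."""
    record = relation_file[compound_number]
    return record["substrate_of"], record["product_of"], record["inhibits"]


substrate_of, product_of, inhibits = read_relations(RELATION_FILE, "C6")
```

Keying the list by compound number means that a single access returns every enzyme related to the compound, which is what the second process portion needs.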

Next explained is a biochemical information computer program product(recording medium) according to an embodiment of the present invention.

FIG. 44 is a block diagram showing the structure of the biochemical information computer program product (recording medium) 2 according to the embodiment of the present invention. As shown in the drawing, the biochemical information recording medium 2 of the present embodiment comprises a file area 2 b for recording files, and a program area 2 a for recording programs. Recorded in the file area 2 b are a compound information file 31, an enzyme information file 32, a relation information file 33, a partial correlation data file 34, a bond table file 35, and a receptor information file 36.

Among them, the compound information file 31 stores a list showing therelation between compound numbers of compounds and canonical datacorresponding to the compounds, and additional information (alsoreferred to as reference data) about the compounds. The enzymeinformation file 32 stores a list showing the relation among enzymenumbers of enzymes, compound numbers of compounds being substrates forthe enzymes, and compound numbers of compounds being products by theenzymes, and additional information about the enzymes.

Further, the relation information file 33 stores a list showing the relation among compound numbers of compounds, enzyme numbers of enzymes with a relevant compound being a substrate, enzyme numbers of enzymes with a relevant compound being a product, receptor numbers of receptors with a relevant compound being an agonist, and receptor numbers of receptors with a relevant compound being an antagonist. Furthermore, the partial correlation data file 34 is prepared to store the reaction path information, and the bond table file 35 to store the bond table data, respectively. Moreover, the receptor information file 36 stores a list showing the relation among receptor numbers of receptors, compound numbers of compounds being agonists for the receptors, compound numbers of compounds being antagonists for the receptors, and additional information about the receptors.

The biochemical information processing program 22 is recorded in the program area 2 a. The biochemical information processing program 22 comprises the main program 23 for generally controlling the processing, the three-dimensional indication program 24 for three-dimensionally displaying the image data, the reaction scheme detection program 25 for detecting a chemical reaction scheme between compounds, the receptor information detection program 26 for detecting the additional information about a receptor, and the reaction path detection program 27 for detecting a reaction path of plural compounds. The reaction scheme detection program 25 comprises the first process routine 25 a to the fourth process routine 25 d, the receptor information detection program 26 comprises the fifth process routine 26 a to the eighth process routine 26 d, and the reaction path detection program 27 comprises the ninth process routine 27 a to the thirteenth process routine 27 e.

A disk type recording medium, for example, such as a flexible disk or aCD-ROM, is used as the biochemical information recording medium 2. Also,a tape type recording medium such as a magnetic tape may be applied.

The biochemical information recording medium 2 of the present embodiment can be used in the information processing apparatus 1 shown in FIG. 45 and FIG. 46. In detail, the information processing apparatus 1 has a medium drive device 3, and the biochemical information recording medium 2 can be loaded in the medium drive device 3. This loading enables the medium drive device 3 to access the biochemical information recorded in the biochemical information recording medium 2. This makes it possible for the information processing apparatus 1 to carry out the biochemical information processing program 22 recorded in the program area 2 a.

The structure of this information processing apparatus 1 is as follows.First, it is provided with the above-described medium drive device 3,the image memory 10 for storing the image data indicating the molecularstructure diagram or the like of compound, the work memory (innermemory) 11 with resident operating system (OS), and the display 40 asdisplay means. Also, it is provided with the input device 50 being inputmeans having the mouse 51 for accepting input of image data and thekeyboard 52 for accepting input of symbolic data, the printer 60 foroutputting the image data or the like, and the CPU 70 for controllingexecution or the like of the biochemical information processing program22.

The medium drive device 3 applied is a flexible disk drive device, aCD-ROM drive device, a magnetic tape drive device, or the like,depending upon the biochemical information recording medium 2.

The detailed structure of the compound information file 31, enzymeinformation file 32, relation information file 33, partial correlationdata file 34, bond table file 35, and receptor information file 36recorded in the biochemical information recording medium 2 of thepresent embodiment is as described previously (FIG. 2 to FIG. 6).

The flow of data in the information processing apparatus 1 is also as described previously (FIG. 7 to FIG. 10): the input image data 80 is converted into one of the bond table data 81, the canonical data 82, and the three-dimensional data 83 for use, and the canonical data 82 is mainly used in the biochemical information processing program 22 recorded in the program area 2 a.

Next explained is the process of the biochemical information processing program 22 recorded in the program area 2 a of the biochemical information recording medium 2. This process is carried out by executing the biochemical information processing program 22 read out by the medium drive device 3. This execution first starts the main program 23 of the biochemical information processing program 22.

The details of the processes of main program 23, three-dimensionalindication program 24, reaction scheme detection program 25, reactionpath detection program 27, and receptor information detection program 26thereafter are also as described previously (FIG. 11 to FIG. 15), and,for example as shown in FIG. 16 and FIG. 17, reaction scheme data or thelike is indicated on the display 40.

Next explained is the canonical data preparation program suitablyapplicable to the present invention.

The biochemical information computer program product (recording medium)2, being the embodiment of the present invention and shown in FIG. 44,is provided with the canonical data preparation program according to thepresent invention; that is, the biochemical information recording medium2 is provided with the file area 2 b for storing files and the programarea 2 a for storing programs. The bond table file 35, compoundinformation file 31, etc. are stored in the file area 2 b.

A plurality of bond tables 81 can be recorded in the bond table file 35. Recorded in a bond table 81 are characteristic data about each of the atoms constituting a compound and bond pair data between atoms, and the canonical data preparation program 91 can access these data through the bond table 81.

The compound information file 31 and bond table 81 are as describedpreviously (FIG. 3, FIG. 18A, and FIG. 18B).

The canonical data preparation program 91 is stored in the program area2 a. The canonical data preparation program 91 is a program forpreparing the canonical data, based on the characteristic data abouteach of the atoms constituting the compound and the bond pair databetween atoms. This canonical data preparation program 91 comprises themain routine 91 a for generally controlling the processes and theconstituent atom classification routine 91 b for assigning a classnumber to each of atoms constituting a compound. The canonical datapreparation program 91 also comprises the canonical number assignmentroutine 91 c for assigning a canonical number to each atom, based on theclass numbers, and the canonical data preparation routine 91 d forpreparing the canonical data based on the canonical numbers of therespective atoms.

The biochemical information recording medium 2 can be utilized in theinformation processing apparatus 1 shown in FIG. 45, as describedpreviously. Pointing devices other than the mouse 51 include a tablet, adigitizer, a light pen, and so on, and the mouse 51 may be replaced byeither one of these devices.

The schematic operation of the information processing apparatus 1 is also as described previously.

Next explained is the process of the canonical data preparation program 91 stored in the program area 2 a of the biochemical information recording medium 2. This process is carried out by executing the canonical data preparation program 91 read out by the medium drive device 3. Execution begins with the main routine 91 a of the canonical data preparation program 91.

The details of the subsequent processes of the main routine 91 a, constituent atom classification routine 91 b, canonical number assignment routine 91 c, and canonical data preparation routine 91 d are also as described previously (FIG. 20 to FIG. 37), and the canonical data for uniquely specifying a compound can be obtained in a short time.

The foregoing described the preferred embodiment of the biochemical information computer program product (recording medium) of the present invention, but it is noted that the present invention is not limited to the above embodiment.

For example, the canonical data preparation program 91 according to the present invention does not have to be present together with the biochemical information processing program 22 according to the present invention in a single medium; the canonical data preparation program 91 and the biochemical information processing program 22 may instead be recorded in separate media, as shown in FIG. 47 and FIG. 48.

Namely, as shown in FIG. 48, the canonical data preparation program 91 according to the present invention may be formed on its own as a storage medium 2 for preparation of canonical data. In this case, the storage medium 2 for preparation of canonical data can be utilized by the information processing apparatus 1 shown in FIG. 49. That is, the information processing apparatus 1 is provided with the medium drive device 3, and the storage medium 2 for preparation of canonical data can be loaded in this device 3. This loading enables the medium drive device 3 to access the information stored in the storage medium 2 for preparation of canonical data, which in turn enables the information processing apparatus 1 to carry out the canonical data preparation program 91 stored in the program area 2 a. An applicable storage medium 2 for preparation of canonical data is, for example, a disk type storage medium such as a flexible disk or a CD-ROM, or a tape type storage medium such as a magnetic tape.

The biochemical information computer program product (recording medium) of the present invention does not have to comprise all of the reaction scheme detection program 25, receptor information detection program 26, and reaction path detection program 27, but may be arranged, for example as shown in FIG. 50, to comprise the reaction scheme detection means (the reaction scheme detection program) 25 and the reaction path detection means (the reaction path detection program) 27, or may be arranged to comprise only one of them. In this case, the receptor information file 36 is not necessary, and the main program 23 is the same as that shown in FIG. 11 except that it excludes step S115 for calling the receptor information indication program, as shown in FIG. 43.

The present invention is not limited to the above embodiments and can have a variety of modifications. For example, an amino acid sequence defining the structure of an enzyme, or a base sequence, may be recorded in the column of reference data in the enzyme information file 32. Similarly, an amino acid sequence defining the structure of a receptor, or a base sequence, may be recorded in the column of reference data in the receptor information file 36. Recording these sequences in the reference data makes it possible to use the files in connection with genetic information.

An anomaly in the function of a specific enzyme can cause a disease called dysbolism. Thus, information about abnormal enzymes may be recorded in the column of reference data in the enzyme information file 32 and used in searching for dysbolism.

Further, the compound information file 31, enzyme information file 32, and relation information file 33 may include a record of information on the conversion of a foreign material that occurs when a living body is dosed with the foreign material (that is, a material not originally existing in living bodies).

Furthermore, the compound information file 31, enzyme information file 32, and relation information file 33 may include a record of information concerning the production or conversion of substances by enzymes or micro-organisms.

Furthermore, many drugs and agricultural chemicals are themselves enzyme inhibitors, agonists (agonistic materials), or antagonists (antagonistic materials). Thus, information about the structures of drugs and agricultural chemicals, or related information, may be recorded as bio-related substances in the compound information file 31.

Yet further, information concerning safety, such as the toxicity of a chemical substance, may be recorded in the column of reference data in the compound information file 31 and may be used in connection with the behavior of the substance in a living body system.

Yet further, information in the field of nutrition may be recorded in the column of reference data of the compound information file 31.

Furthermore, the method of indicating the reaction path may be modified, for example, in such a manner that the overall reaction path diagram is prepared in advance so that it can be indicated at an arbitrary position and scale, and a desired part of the reaction path can be indicated by scrolling the screen vertically or horizontally. The search for a compound may adopt search by partial structure (partial identity search), search based on similarity, or the like. Further, the search for a reaction path may be directed to a specific compound group, for example, steroid metabolism.

The present processing apparatus or the present processing method may also be used as a compound database system, and each item of information of the compound database system may be recorded in the medium of the present invention. In this case, it is possible to perform a search based on compound data such as values of physical properties. Based on the three-dimensional structure data of a compound, a theoretical chemistry calculation function, such as calculation of molecular orbitals or calculation of a molecular force field, may be added to the present processing apparatus or the present processing method. Using the present processing apparatus or the present processing method, one can also determine a reaction path when a specific enzyme is inhibited or inactivated or when an enzyme is defective.

Furthermore, the biochemical information recording medium of the present invention may include a record of information for determining the reaction path when a specific enzyme is inhibited or inactivated or when an enzyme is defective.

INDUSTRIAL APPLICABILITY

As detailed above, the biochemical information processing apparatus and biochemical information processing method of the present invention can efficiently perform detection of a reaction scheme, detection of receptor information, and detection of a reaction path. Likewise, use of the biochemical information recording medium of the present invention makes it possible to perform the detection of a reaction scheme, detection of receptor information, and detection of a reaction path efficiently.

In the detection of a reaction scheme, first, reference is made to the list stored in the compound information file to read out a compound number corresponding to canonical data. Then, based on this compound number, reference is made to the relation information file to read out an enzyme number of an enzyme with this compound being a substrate or a product. Further, based on this enzyme number, reference is made to the enzyme information file to read out information about this enzyme. Then a chemical reaction scheme involving this compound is obtained from the information about the compound and enzyme thus read out.

In this way, by mutual reference to the compound information file, enzyme information file, and relation information file, various information can be efficiently acquired for an enzyme with a compound being a substrate or a product, even when the structure of the compound is used as the key.

Particularly, since the relation information file stores the list showing the relationship between compounds and enzymes with the compounds being substrates or products, it is easy to search for the relationship among a compound being a substrate, a compound being a product, and an enzyme for changing the substrate to the product, whereby a chemical reaction scheme can be obtained efficiently.
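The lookup sequence for reaction scheme detection described above can be illustrated by the following minimal sketch. It is not the program of the embodiment: the three files are modelled as in-memory Python dictionaries, and the canonical strings, compound numbers, enzyme numbers, field names, and the function name detect_reaction_schemes are all hypothetical values introduced only for illustration.

```python
# Hypothetical stand-ins for the three files; every key and value is illustrative.
compound_file = {          # canonical data -> compound number
    "CANON_A": 1,
    "CANON_B": 2,
}
enzyme_file = {            # enzyme number -> substrates, products, additional information
    101: {"substrates": [1], "products": [2], "info": "example enzyme"},
}
relation_file = {          # compound number (key) -> enzyme numbers by role
    1: {"as_substrate": [101], "as_product": []},
    2: {"as_substrate": [], "as_product": [101]},
}


def detect_reaction_schemes(canonical_data):
    """Return one record per enzyme that has the compound as substrate or product."""
    # Step 1: compound information file -> compound number.
    compound_no = compound_file.get(canonical_data)
    if compound_no is None:
        return []  # canonical data not registered
    # Step 2: relation information file -> enzyme numbers.
    relation = relation_file.get(compound_no, {})
    enzyme_numbers = relation.get("as_substrate", []) + relation.get("as_product", [])
    # Step 3: enzyme information file -> partner compounds and additional information.
    schemes = []
    for enzyme_no in enzyme_numbers:
        record = enzyme_file[enzyme_no]
        schemes.append({
            "enzyme": enzyme_no,
            "substrates": record["substrates"],
            "products": record["products"],
            "info": record["info"],
        })
    return schemes
```

Because the relation information file is keyed by compound number, the enzymes relevant to a compound are found by a single lookup rather than by scanning every enzyme record.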

In the detection of receptor information, first, reference is made to the list stored in the compound information file to read out a compound number corresponding to canonical data. Next, based on this compound number, reference is made to the relation information file to read out a receptor number of a receptor with this compound being an agonist or an antagonist. Further, based on this receptor number, reference is made to the receptor information file to read out the additional information about this receptor. Then the additional information about the receptor thus read out is indicated on the display means.

In this way, by mutual reference to the compound information file, receptor information file, and relation information file, various information can be acquired efficiently for a receptor with a compound being an agonist or an antagonist, even when the structure of the compound is used as the key.

Particularly, since the relation information file stores the list showing the relationship between compounds and receptors with the compounds being agonists or antagonists, it is easy to search for the relationship among a compound being an agonist, a compound being an antagonist, and a receptor, whereby various information about the receptor can be obtained efficiently.
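The receptor lookup described above follows the same pattern and can be sketched as below. This is again only an illustration under assumptions: the dictionaries, the numbers 7, 301, and 302, the field names, and the function name detect_receptor_info are hypothetical and do not come from the embodiment.

```python
# Hypothetical stand-ins for the files; every key and value is illustrative.
compound_file = {"CANON_X": 7}                      # canonical data -> compound number
relation_file = {                                   # compound number -> receptor numbers by role
    7: {"agonist_of": [301], "antagonist_of": [302]},
}
receptor_file = {                                   # receptor number -> additional information
    301: {"info": "illustrative receptor R1"},
    302: {"info": "illustrative receptor R2"},
}


def detect_receptor_info(canonical_data):
    """Return (receptor number, role of the compound, additional information) tuples."""
    # Step 1: compound information file -> compound number.
    compound_no = compound_file.get(canonical_data)
    if compound_no is None:
        return []
    # Step 2: relation information file -> receptor numbers, separated by role.
    relation = relation_file.get(compound_no, {})
    results = []
    for role, field in (("agonist", "agonist_of"), ("antagonist", "antagonist_of")):
        for receptor_no in relation.get(field, []):
            # Step 3: receptor information file -> additional information.
            results.append((receptor_no, role, receptor_file[receptor_no]["info"]))
    return results
```

Keeping the agonist and antagonist lists as separate fields preserves the role of the compound, so the displayed information can state in which capacity the compound acts on each receptor.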

Further, in the detection of a reaction path, first, reference is made to the list stored in the compound information file to read out a compound number corresponding to canonical data. Next, based on this compound number, reference is made to the relation information file to read out each of an enzyme number of an enzyme with this compound being a substrate and an enzyme number of an enzyme with this compound being a product. Further, based on these enzyme numbers, reference is made to the enzyme information file to read out a compound number of a compound being a substrate and a compound number of a compound being a product for every enzyme. Reading from the relation information file and the enzyme information file is repetitively carried out. Then, from a plurality of compound numbers and a plurality of enzyme numbers thus read out, a reaction path of these compounds is obtained.

In this way, by mutual reference to the compound information file, enzyme information file, and relation information file, it is possible to efficiently search for a reaction path involving a plurality of compounds.

Particularly, since the relation information file stores the list showing the relationship between compounds and enzymes with the compounds being substrates or products, it is easy to search for the relationship among a compound being a substrate, a compound being a product, and an enzyme for changing the substrate to the product, whereby a reaction path involving a plurality of compounds can be obtained efficiently.
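The repeated reading from the relation information file and the enzyme information file can be pictured as a graph traversal. The sketch below uses a breadth-first traversal with visited sets as the stopping rule; that stopping rule, the dictionaries, the numbers, and the function name detect_reaction_path are assumptions made for illustration, since the passage above does not state how the repetition is terminated.

```python
from collections import deque

# Hypothetical stand-ins for the files: a two-step path 1 -> 2 -> 3.
compound_file = {"CANON_A": 1, "CANON_B": 2, "CANON_C": 3}
enzyme_file = {
    101: {"substrates": [1], "products": [2]},
    102: {"substrates": [2], "products": [3]},
}
relation_file = {
    1: {"as_substrate": [101], "as_product": []},
    2: {"as_substrate": [102], "as_product": [101]},
    3: {"as_substrate": [], "as_product": [102]},
}


def detect_reaction_path(canonical_data):
    """Collect the compound numbers and enzyme numbers reachable from one compound."""
    start = compound_file.get(canonical_data)
    if start is None:
        return set(), set()
    seen_compounds, seen_enzymes = {start}, set()
    queue = deque([start])
    while queue:
        compound_no = queue.popleft()
        # Relation information file: enzymes using this compound as substrate or product.
        relation = relation_file.get(compound_no, {})
        for enzyme_no in relation.get("as_substrate", []) + relation.get("as_product", []):
            if enzyme_no in seen_enzymes:
                continue  # assumed stopping rule: never expand an enzyme twice
            seen_enzymes.add(enzyme_no)
            # Enzyme information file: compounds on both sides of the reaction.
            record = enzyme_file[enzyme_no]
            for other in record["substrates"] + record["products"]:
                if other not in seen_compounds:
                    seen_compounds.add(other)
                    queue.append(other)
    return seen_compounds, seen_enzymes
```

Starting from any compound on the illustrative path, the traversal collects the same sets of compound numbers and enzyme numbers, from which a path diagram could then be drawn.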

Further, when the canonical data preparation means (the canonical data preparation program) according to the present invention is employed, the characteristic data about each atom and the bonding pair data between atoms, accepted through the input means, are given to the canonical data preparation means. The canonical data preparation means then prepares the canonical data based on these data within a short time. Also, by the canonical data preparation method according to the present invention, the canonical data is prepared within a short time, based on the characteristic data about each of the atoms constituting a compound and the bonding pair data between atoms. As described, the canonical data prepared by the canonical data preparation means (the canonical data preparation program) and the canonical data preparation method according to the present invention is a very short string of characters, numerals, and symbols, and the canonical data can be saved within a small storage area. Therefore, if the canonical data preparation means (the canonical data preparation program) and the canonical data preparation method according to the present invention are utilized in a compound/reaction database system, the amount of storage area used in the compound/reaction database system can be decreased remarkably.
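As a rough illustration of why the canonical data is short and independent of the input order of the atoms, the following simplified sketch assigns canonical numbers from class numbers and then lines up a parent number, bond symbol, and atom symbol for each atom. Several points are assumptions rather than the routines of the embodiment: the class numbers are supplied by hand instead of being computed by the constituent atom classification routine, a smaller class number is taken to mean higher priority, ties are broken by input number, the first atom is written with a placeholder parent number 0 and no bond symbol, and ring closure bonds are not encoded.

```python
def assign_canonical_numbers(class_no, bonds):
    """class_no: atom id -> class number; bonds: (atom, atom) -> bond symbol."""
    neighbours = {atom: set() for atom in class_no}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def priority(atom):
        # Assumption: smaller class number = higher priority; ties broken by atom id.
        return (class_no[atom], atom)

    canon = {min(class_no, key=priority): 1}
    while len(canon) < len(class_no):
        # Numbered atoms that still bond to an atom without a canonical number.
        open_atoms = [a for a in canon if any(n not in canon for n in neighbours[a])]
        if not open_atoms:
            raise ValueError("atoms are not all connected")
        selected = min(open_atoms, key=canon.get)       # minimum canonical number
        candidates = [n for n in neighbours[selected] if n not in canon]
        canon[min(candidates, key=priority)] = len(canon) + 1
    return canon, neighbours


def prepare_canonical_data(symbols, class_no, bonds):
    """Line up (parent number, bond symbol, atom symbol) in canonical-number order."""
    canon, neighbours = assign_canonical_numbers(class_no, bonds)
    atom_of = {number: atom for atom, number in canon.items()}
    parts = []
    for i in range(1, len(canon) + 1):
        atom = atom_of[i]
        if i == 1:
            parts.append(f"0{symbols[atom]}")           # placeholder for the first atom
            continue
        partner = min(neighbours[atom], key=canon.get)  # neighbour with minimum number
        bond = bonds.get((atom, partner)) or bonds[(partner, atom)]
        parts.append(f"{canon[partner]}{bond}{symbols[atom]}")
    return "".join(parts)


# The same three-atom chain C-C-O entered in two different atom orders.
first = prepare_canonical_data(
    {0: "C", 1: "C", 2: "O"}, {0: 3, 1: 2, 2: 1}, {(0, 1): "-", (1, 2): "-"})
second = prepare_canonical_data(
    {0: "O", 1: "C", 2: "C"}, {0: 1, 1: 2, 2: 3}, {(0, 1): "-", (1, 2): "-"})
```

Both input orders yield the same string, which is the property that allows the canonical data to serve as a search key in the compound information file.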

What is claimed is:
1. A biochemical information processing apparatus comprising: storage means for storing biochemical information about compounds and enzymes; input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information; reaction scheme detection means for, when said input means accepts data about a compound being a substrate and/or a product, detecting a chemical reaction scheme involving said compound based on the data characterizing the compound as at least one of a substrate and a product; and display means for indicating a reaction scheme diagram of the chemical reaction scheme; wherein said storage means comprises: a compound information file storing a list showing a relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds, an enzyme information file storing a list showing a relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and a relation information file storing a list showing a relation among compound numbers of compounds as a key, enzyme numbers of enzymes with said compound being a substrate, and enzyme numbers of enzymes with said compound being a product; and wherein said reaction scheme detection means comprises: a first process portion for preparing from the data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a second process portion for reading an enzyme number of an enzyme with the compound being a substrate or a product out of said relation information file, based on the compound number read out in said first process portion, a third process portion for reading a compound number of another compound constituting a reaction system together with the enzyme of the enzyme number read out in said second process portion and the compound of the compound number read out in said first process portion, and additional information about said enzyme out of said enzyme information file, and a fourth process portion for indicating a reaction scheme diagram of the compound whose image or symbolic data was accepted through said input means on said display means from the compound number read out in said first process portion, the enzyme number read out in said second process portion, and the compound number of the another compound read out in said third process portion, and further indicating the additional information about the enzyme read out in said third process portion on said display means.
2. The biochemical information processing apparatus according to claim 1, said biochemical information processing apparatus further comprising receptor information detection means for, when said input means accepts data about a compound, detecting additional information about a receptor based on the data with said compound being an agonist and/or an antagonist, wherein said storage means further stores biochemical information about receptors, and further comprises a receptor information file storing a list showing the relation between receptor numbers of receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, wherein said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with said compound being a substrate, the enzyme numbers of the enzymes with said compound being a product, the receptor numbers of the receptors with said compound being an agonist, and the receptor numbers of the receptors with said compound being an antagonist; and wherein said receptor information detection means comprises: a fifth process portion for preparing from data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on said canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a sixth process portion for reading, based on the compound number read out in said fifth process portion, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file, a seventh process portion for reading at least additional information about the receptor of the receptor number read out in said sixth process portion out of said receptor information file, and an eighth process portion for indicating at least the additional information about the receptor read out in said seventh process portion on said display means.
3. The biochemical information processing apparatus according to claim 1, said biochemical information processing apparatus further comprising reaction path detection means for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a predetermined reaction path, detecting the predetermined reaction path of said plurality of compounds based on the data about the predetermined compound; wherein said reaction path detection means comprises: a fifth process portion for preparing from the data about the predetermined compound accepted through said input means said canonical data uniquely indicating a chemical structure of said predetermined compound, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a sixth process portion for reading, based on the compound number read out in said fifth process portion, an enzyme number of an enzyme with the predetermined compound being a substrate and an enzyme number of an enzyme with the predetermined compound being a product out of said relation information file, a seventh process portion for reading, based on each enzyme number read out in said sixth process portion, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file, an eighth process portion for repeating said sixth process portion and said seventh process portion to obtain compounds and enzymes within the predetermined reaction path, and a ninth process portion for indicating from enzyme numbers read out in said sixth process portion and compound numbers read out in said seventh process portion a reaction scheme diagram of said plurality of compounds along the predetermined reaction path on said display means.
4. The biochemical information processing apparatus according to claim 1, wherein said input means accepts input of characteristic data about each of the atoms constituting a compound and bonding pair data between the atoms, wherein said biochemical information processing apparatus further comprises canonical data preparation means for preparing canonical data to uniquely specify a chemical structure of said compound, based on characteristic or bonding pair data accepted through said input means; and wherein said canonical data preparation means comprises: a constituent atom classification process portion for classifying, based on the characteristic or bonding pair data accepted through said input means, the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class, a canonical number assignment process portion for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification process portion, and a canonical data preparation process portion for preparing said canonical data, based on the canonical numbers assigned to the respective atoms in said canonical number assignment process portion.
5. The biochemical information processing apparatus according to claim 4, wherein said constituent atom classification process portion assigns three types of attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom, where among said three types of attributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom of input number i, b_(ij) is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_(ij) is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path, wherein said canonical number assignment process portion is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned, said canonical number assignment process portion selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to a selected atom and having no canonical number yet, and wherein said canonical data preparation process portion gives three types of attributes (P_(i), T_(i), S_(i)) to each atom and aligns these attributes in line to prepare said canonical data, where among said three types of attributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_(i) is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_(i), and S_(i) is a symbol for a kind of the atom of canonical number i.
6. A biochemical information processing apparatus comprising: storage means for storing biochemical information about compounds and enzymes; input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information; reaction path detection means for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a predetermined reaction path, detecting the predetermined reaction path of said plurality of compounds based on the data about the predetermined compound; and display means for indicating a reaction scheme diagram of a chemical reaction scheme; wherein said storage means comprises: a compound information file storing a list showing the relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds, an enzyme information file storing a list showing the relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and a relation information file storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with said predetermined compound being a substrate, and enzyme numbers of enzymes with said predetermined compound being a product; and wherein said reaction path detection means comprises: a first process portion for preparing from the data about the predetermined compound accepted through said input means said canonical data uniquely indicating a chemical structure of said predetermined compound, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a second process portion for reading, based on the compound number read out in said first process portion, an enzyme number of an enzyme with the predetermined compound being a substrate and an enzyme number of an enzyme with the predetermined compound being a product out of said relation information file, a third process portion for reading, based on each enzyme number read out in said second process portion, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file, a fourth process portion for repeating said second process portion and said third process portion to obtain compounds and enzymes within the predetermined reaction path, and a fifth process portion for indicating from enzyme numbers read out in said second process portion and compound numbers read out in said third process portion a reaction scheme diagram of said plurality of compounds along the predetermined reaction path on said display means.
7. The biochemical information processing apparatus according to claim 6, said biochemical information processing apparatus further comprising receptor information detection means for, when said input means accepts data about a predetermined compound, detecting additional information about a receptor with said predetermined compound being an agonist and/or an antagonist based on the data about a predetermined compound; wherein said storage means further stores biochemical information about receptors, and further comprises a receptor information file storing a list showing the relation between receptor numbers of receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, wherein said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with said predetermined compound being a substrate, the enzyme numbers of the enzymes with said predetermined compound being a product, the receptor numbers of the receptors with said predetermined compound being an agonist, and the receptor numbers of the receptors with said predetermined compound being an antagonist; and wherein said receptor information detection means comprises: a sixth process portion for preparing from data about the predetermined compound accepted through said input means said canonical data uniquely indicating a chemical structure of said predetermined compound, further searching said compound information file, based on said canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a seventh process portion for reading, based on the compound number read out in said sixth process portion, a receptor number of a receptor with the predetermined compound being an agonist or an antagonist out of said relation information file, an eighth process portion for reading at least additional information about the receptor of the receptor number read out in said seventh process portion out of said receptor information file, and a ninth process portion for indicating at least the additional information about the receptor read out in said eighth process portion on said display means.
8. The biochemical information processing apparatus according to claim 6, wherein said input means accepts input of characteristic data about each of the atoms constituting a compound and bonding pair data between the atoms, wherein said biochemical information processing apparatus further comprises canonical data preparation means for preparing canonical data to uniquely specify a chemical structure of said compound, based on the characteristic or bonding pair data accepted through said input means; and wherein said canonical data preparation means comprises: a constituent atom classification process portion for classifying, based on each characteristic or bonding pair data accepted through said input means, the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class, a canonical number assignment process portion for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification process portion, and a canonical data preparation process portion for preparing said canonical data, based on the canonical numbers assigned to the respective atoms in said canonical number assignment process portion.
 9. The biochemical information processing apparatus according to claim 8, wherein said constituent atom classification process portion assigns three types of attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom, where among said three types of attributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom of input number i, b_(ij) is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_(ij) is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path, wherein said canonical number assignment process portion is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned, said canonical number assignment process portion selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to a selected atom and having no canonical number yet, and wherein said canonical data preparation process portion gives three types of attributes (P_(i), T_(i), S_(i)) to each atom and aligns these attributes in line to prepare said canonical data, where among said three types of attributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_(i) is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_(i), and S_(i) is a symbol for a kind of the atom of canonical number i.
 10. A biochemical information processing method, comprising: providing an information processing apparatus having input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information, display means for indicating a reaction scheme diagram of a chemical reaction scheme, and storage means for storing biochemical information about compounds and enzymes including a compound information file storing a list showing the relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds, an enzyme information file storing a list showing the relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and a relation information file storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with a predetermined compound being a substrate, and enzyme numbers of enzymes with said predetermined compound being a product; in a first step, when said input means accepts data about a compound being a substrate and/or a product, preparing said canonical data uniquely indicating a chemical structure of said compound from the data characterizing the compound as at least one of a substrate and a product, further searching said compound information file based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file; in a second step, reading an enzyme number of an enzyme with the compound being a substrate or a product out of said relation information file, based on the compound number read out in said first step; in a third step, reading a compound number of another compound constituting a reaction system together with the enzyme of the enzyme number read out in said second step and said compound of the compound number read out in said first step, and additional information about said enzyme out of said enzyme information file; and in a fourth step, indicating a reaction scheme diagram of the compound whose image or symbolic data was accepted through said input means on said display means from the compound number read out in said first step, the enzyme number read out in said second step, and the compound number of the another compound read out in said third step, and further indicating the additional information about the enzyme read out in said third step on said display means.
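By way of illustration only, the four steps of the method of claim 10 can be sketched in Python as a chain of table lookups. The three dictionaries stand in for the compound information file, the enzyme information file, and the relation information file; their layout, the placeholder canonical-data strings, the compound and enzyme numbers, and the function name find_reaction_schemes are all assumptions made for this sketch and are not taken from the claims. Preparation of the canonical data is assumed to have been done already, and the fourth (display) step is reduced to returning the data a diagram would be drawn from.

```python
# Hypothetical toy stand-ins for the three files named in claim 10.
# compound information file: canonical data -> compound number
COMPOUND_FILE = {"canon:glucose": 1, "canon:glucose-6-phosphate": 2}

# enzyme information file:
# enzyme number -> (substrate compound numbers, product compound numbers, additional info)
ENZYME_FILE = {101: ([1], [2], "hexokinase")}

# relation information file, keyed by compound number:
# compound number -> (enzymes with it as substrate, enzymes with it as product)
RELATION_FILE = {1: ([101], []), 2: ([], [101])}


def find_reaction_schemes(canonical_data):
    """Return (compound, enzyme, other compounds, enzyme info) tuples."""
    # First step: search the compound file by canonical data.
    compound_no = COMPOUND_FILE.get(canonical_data)
    if compound_no is None:
        return []  # canonical data not present in the compound file

    # Second step: read enzyme numbers from the relation file.
    as_substrate, as_product = RELATION_FILE.get(compound_no, ([], []))

    schemes = []
    for enzyme_no in as_substrate + as_product:
        # Third step: read the other compounds and the enzyme's
        # additional information from the enzyme file.
        substrates, products, info = ENZYME_FILE[enzyme_no]
        others = [c for c in substrates + products if c != compound_no]
        # Fourth step (display) is left to the caller.
        schemes.append((compound_no, enzyme_no, others, info))
    return schemes


print(find_reaction_schemes("canon:glucose"))
```

Keying the relation table by compound number means the second step never has to scan the enzyme table, which appears to be the purpose of keeping a separate relation information file.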
 11. The biochemical information processing method according to claim 10, further comprising: further providing in said storage means stored biochemical information about receptors, and a receptor information file storing a list showing a relation between receptor numbers of receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, said relation information file storing a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with said predetermined compound being a substrate, the enzyme numbers of the enzymes with said predetermined compound being a product, the receptor numbers of the receptors with said predetermined compound being an agonist, and the receptor numbers of the receptors with said predetermined compound being an antagonist; in a fifth step, when said input means accepts data about the compound, preparing said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on said canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file; in a sixth step, reading, based on the compound number read out in said fifth step, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file; in a seventh step, reading at least additional information about the receptor of the receptor number read out in said sixth step out of said receptor information file; and in an eighth step, indicating at least the additional information about the receptor read out in said seventh step on said display means.
 12. The biochemical information processing method according to claim 10, said biochemical information processing method further comprising: in a fifth step, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a predetermined reaction path, preparing said canonical data uniquely indicating a chemical structure of said predetermined compound from the data about the predetermined compound, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file; in a sixth step, reading, based on the compound number read out in said fifth step, an enzyme number of an enzyme with the predetermined compound being a substrate and an enzyme number of an enzyme with the predetermined compound being a product out of said relation information file; in a seventh step, reading, based on each enzyme number read out in said sixth step, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file; in an eighth step, repeating said sixth step and said seventh step to obtain compounds and enzymes within the predetermined reaction path; and in a ninth step, indicating from enzyme numbers read out in said sixth step and compound numbers read out in said seventh step a reaction scheme diagram of said plurality of compounds along the predetermined reaction path on said display means.
 13. The biochemical information processing method according to claim 10, wherein said input means accepts input of characteristic data about each of the atoms constituting a compound and bonding pair data between the atoms, and wherein said biochemical information processing method further comprises: a constituent atom classification step for classifying, based on the characteristic or bonding pair data accepted through said input means, the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class; a canonical number assignment step for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification step; and a canonical data preparation step for preparing said canonical data enabling to uniquely specify a chemical structure of said compound, based on the canonical numbers assigned to the respective atoms in said canonical number assignment step.
 14. The biochemical information processing method according to claim 13, wherein said constituent atom classification step assigns three types of attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom, where among said three types of attributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom of input number i, b_(ij) is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_(ij) is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path, wherein said canonical number assignment step is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned, said canonical number assignment step selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to said selected atom and having no canonical number yet, and wherein said canonical data preparation step gives three types of attributes (P_(i), T_(i), S_(i)) to each atom and aligns these attributes in line to prepare said canonical data, where among said three types of attributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_(i) is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_(i), and S_(i) is a symbol for a kind of the atom of canonical number i.
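By way of illustration only, the constituent atom classification recited above might be sketched in Python as follows. Several points are assumptions of this sketch rather than statements of the claimed method: atom kinds and bond kinds are represented by strings instead of kind numbers; d_(ij) is read as the total number of shortest routes of exactly j bonds that start at atom i, which is only one possible reading of the claim wording; atoms whose three attributes all agree are simply placed in the same class, although the claim only states that differing attributes prove non-equivalence; and the ordering of class numbers (sorted attribute order) is arbitrary. The function names atom_attributes and class_numbers are hypothetical.

```python
from collections import Counter, deque


def atom_attributes(kinds, bonds):
    """Compute (a_i, b_i, d_i) for each atom.

    kinds: list of atom kind symbols, indexed by input number (0-based here).
    bonds: list of (atom, atom, bond kind) triples.
    """
    adjacency = {i: [] for i in range(len(kinds))}
    for a, b, bond_kind in bonds:
        adjacency[a].append((b, bond_kind))
        adjacency[b].append((a, bond_kind))

    attributes = []
    for i, kind in enumerate(kinds):
        # b_i: how many adjoining bonds there are of each bond kind j.
        b_i = tuple(sorted(Counter(k for _, k in adjacency[i]).items()))

        # d_i: breadth-first search that also counts shortest routes.
        distance = {i: 0}
        routes = {i: 1}
        queue = deque([i])
        while queue:
            u = queue.popleft()
            for v, _ in adjacency[u]:
                if v not in distance:
                    distance[v] = distance[u] + 1
                    routes[v] = routes[u]
                    queue.append(v)
                elif distance[v] == distance[u] + 1:
                    routes[v] += routes[u]
        per_length = Counter()
        for v, dist in distance.items():
            if dist > 0:
                per_length[dist] += routes[v]
        d_i = tuple(sorted(per_length.items()))

        attributes.append((kind, b_i, d_i))
    return attributes


def class_numbers(attributes):
    """Give atoms with identical attributes the same class number (from 1)."""
    rank = {attr: n for n, attr in enumerate(sorted(set(attributes)), start=1)}
    return [rank[attr] for attr in attributes]


# Three-carbon chain C-C-C: the two end atoms share a class.
chain_kinds = ["C", "C", "C"]
chain_bonds = [(0, 1, "-"), (1, 2, "-")]
print(class_numbers(atom_attributes(chain_kinds, chain_bonds)))
```

In the three-carbon chain the two terminal atoms receive identical attributes and therefore the same class number, while the central atom differs in both its bond count and its route counts.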
 15. A biochemical information processing method, comprising: providing an information processing apparatus having input means for accepting input of image data indicating said biochemical information or symbolic data indicating said biochemical information, display means for indicating a reaction scheme diagram of a chemical reaction scheme, and storage means for storing biochemical information about compounds and enzymes including a compound information file storing a list showing the relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds, an enzyme information file storing a list showing the relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and a relation information file storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with a predetermined compound being a substrate, and enzyme numbers of enzymes with said predetermined compound being a product; in a first step, when said input means accepts data about the predetermined compound selected from a plurality of compounds constituting a predetermined reaction path, preparing said canonical data uniquely indicating a chemical structure of said predetermined compound from the data, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file; in a second step, reading, based on the compound number read out in said first step, an enzyme number of an enzyme with the predetermined compound being a substrate and an enzyme number of an enzyme with the predetermined compound being a product out of said relation information file; in a third step, reading, based on each enzyme number read out in said second step, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file; in a fourth step, repeating said second step and said third step to obtain compounds and enzymes within the predetermined reaction path; and in a fifth step, indicating from enzyme numbers read out in said second step and compound numbers read out in said third step a reaction scheme diagram of said plurality of compounds along the predetermined reaction path on said display means.
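By way of illustration only, the repetition of the second and third steps recited in claim 15 can be sketched in Python as a breadth-first traversal that alternates between a relation table and an enzyme table. The table layout, the compound and enzyme numbers, and the function name trace_reaction_path are hypothetical. The sketch also assumes that the traversal simply follows every enzyme reachable from the starting compound; the claim restricts the result to a predetermined reaction path, and how that restriction is applied is not modeled here.

```python
from collections import deque

# Hypothetical enzyme table:
# enzyme number -> (substrate compound numbers, product compound numbers)
ENZYMES = {101: ([1], [2]), 102: ([2], [3])}

# Hypothetical relation table keyed by compound number:
# compound number -> (enzymes with it as substrate, enzymes with it as product)
RELATIONS = {1: ([101], []), 2: ([102], [101]), 3: ([], [102])}


def trace_reaction_path(start_compound):
    """Return (substrate, enzyme, product) edges reachable from the start."""
    edges = []
    seen_compounds = {start_compound}
    seen_enzymes = set()
    queue = deque([start_compound])
    while queue:
        compound = queue.popleft()
        # Second step: enzymes for which this compound is substrate or product.
        as_substrate, as_product = RELATIONS.get(compound, ([], []))
        for enzyme in as_substrate + as_product:
            if enzyme in seen_enzymes:
                continue  # each enzyme is expanded once, so the loop terminates
            seen_enzymes.add(enzyme)
            # Third step: substrates and products of that enzyme.
            substrates, products = ENZYMES[enzyme]
            for s in substrates:
                for p in products:
                    edges.append((s, enzyme, p))
            # Fourth step: repeat for compounds not visited yet.
            for nxt in substrates + products:
                if nxt not in seen_compounds:
                    seen_compounds.add(nxt)
                    queue.append(nxt)
    return edges


print(trace_reaction_path(1))
```

Starting from any compound on the chain, the traversal recovers both reactions, so the edges needed for a diagram of the whole path are available regardless of which compound was entered.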
 16. The biochemical information processing method according to claim 15, further comprising: further providing in said storage means stored biochemical information about receptors, and a receptor information file storing a list showing a relation between receptor numbers of receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, said relation information file storing a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with said predetermined compound being a substrate, the enzyme numbers of the enzymes with said predetermined compound being a product, the receptor numbers of the receptors with said predetermined compound being an agonist, and the receptor numbers of the receptors with said predetermined compound being an antagonist; in a sixth step, when said input means accepts data about the predetermined compound, preparing said canonical data uniquely indicating a chemical structure of said predetermined compound from the data about the predetermined compound, further searching said compound information file, based on said canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file; in a seventh step, reading, based on the compound number read out in said sixth step, a receptor number of a receptor with the predetermined compound being an agonist or an antagonist out of said relation information file; in an eighth step, reading at least additional information about the receptor of the receptor number read out in said seventh step out of said receptor information file; and in a ninth step, indicating at least the additional information about the receptor read out in said eighth step on said display means.
 17. The biochemical information processing method according to claim 15, wherein said input means accepts input of characteristic data about each of the atoms constituting a compound and bonding pair data between the atoms; and wherein said biochemical information processing method further comprises: a constituent atom classification step for classifying, based on the characteristic or bonding pair data accepted through said input means, the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class, a canonical number assignment step for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification step, and a canonical data preparation step for preparing said canonical data enabling to uniquely specify a chemical structure of said compound, based on the canonical numbers assigned to the respective atoms in said canonical number assignment step.
 18. The biochemical information processing method according to claim 17, wherein said constituent atom classification step assigns three types of attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom, where among said three types of attributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom of input number i, b_(ij) is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_(ij) is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path, wherein said canonical number assignment step is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned, said canonical number assignment step selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to a selected atom and having no canonical number yet, and wherein said canonical data preparation step gives three types of attributes (P_(i), T_(i), S_(i)) to each atom and aligns these attributes in line to prepare said canonical data, where among said three types of attributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_(i) is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_(i), and S_(i) is a symbol for a kind of the atom of canonical number i.
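By way of illustration only, the canonical number assignment step and the canonical data preparation step recited above might be sketched in Python as follows. The sketch takes the class numbers as given input and assumes that a smaller class number means a higher priority, which the claim does not state. Ties between atoms of the same class are broken by input order, and a connected structure is assumed. The function names and the class numbers used in the example are hypothetical.

```python
def assign_canonical_numbers(class_no, bonds):
    """Map each atom index to its canonical number (starting at 1).

    class_no: list of class numbers per atom (smaller = higher priority,
              an assumption of this sketch).
    bonds:    list of (atom, atom, bond type symbol) triples.
    """
    n_atoms = len(class_no)
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b, _ in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def priority(atom):
        return (class_no[atom], atom)  # ties broken by input order

    # Canonical number 1 goes to the atom with the highest priority.
    canon = {min(range(n_atoms), key=priority): 1}
    while len(canon) < n_atoms:
        # Numbered atoms that still bond to an unnumbered atom.
        frontier = [a for a in canon
                    if any(nb not in canon for nb in neighbors[a])]
        if not frontier:
            raise ValueError("structure is not connected")
        # Select the one with the minimum canonical number ...
        selected = min(frontier, key=canon.get)
        # ... and give n+1 to its highest-priority unnumbered neighbor.
        free = [nb for nb in neighbors[selected] if nb not in canon]
        canon[min(free, key=priority)] = len(canon) + 1
    return canon


def canonical_data(kinds, bonds, canon):
    """Return the (P_i, T_i, S_i) triples in order of canonical number i."""
    bond_type = {}
    for a, b, t in bonds:
        bond_type[(a, b)] = bond_type[(b, a)] = t
    rows = []
    for atom in sorted(canon, key=canon.get):
        partners = [b for (a, b) in bond_type if a == atom]
        if not partners:  # lone atom: no bonded partner exists
            rows.append((0, "", kinds[atom]))
            continue
        p = min(partners, key=canon.get)
        rows.append((canon[p], bond_type[(atom, p)], kinds[atom]))
    return rows


# Heavy-atom skeleton C-C-O with hypothetical class numbers (O ranked first).
kinds = ["C", "C", "O"]
bonds = [(0, 1, "-"), (1, 2, "-")]
canon = assign_canonical_numbers([3, 2, 1], bonds)
print(canon)
print(canonical_data(kinds, bonds, canon))
```

Because the numbering depends only on class numbers and connectivity, entering the same skeleton with its atoms in a different order yields the same list of triples, which is what allows the canonical data to serve as a search key for the compound information file.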
 19. A biochemical information computer program product used with an information processing apparatus having input means for accepting input of image data indicating biochemical information or symbolic data indicating biochemical information, display means for indicating a reaction scheme diagram of a chemical reaction scheme, and reading means for reading information out of a computer-usable medium, said computer program product comprising: computer-usable medium, said computer-usable medium having a file area for recording a file and a program area for recording a program and having computer-readable file and program embodied in said computer-usable medium, for letting a reaction scheme diagram be searched for and be indicated by said display means, based on data input through said input means, said computer program product having, in said file area, a computer-readable compound information file for storing a list showing the relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds, a computer-readable enzyme information file for storing a list showing the relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and a computer-readable relation information file for storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with a compound being a substrate, and enzyme numbers of enzymes with said compound being a product, and having, in said program area, a computer-readable reaction scheme detection program for, when said input means accepts data about the compound being a substrate and/or a product, detecting a chemical reaction scheme involving said compound, based on the data characterizing the compound as at least one of a substrate and a product; wherein said reaction scheme detection program comprises: a first computer-readable process routine for preparing from the data about a compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a second computer-readable process routine for reading an enzyme number of an enzyme with the compound being a substrate or a product out of said relation information file, based on the compound number read out in said first computer-readable process routine, a third computer-readable process routine for reading a compound number of another compound constituting a reaction system together with the enzyme of the enzyme number read out in said second computer-readable process routine and said compound, and additional information about said enzyme out of said enzyme information file, and a fourth computer-readable process routine for indicating a reaction scheme diagram of the compound whose image or symbolic data was accepted through said input means on said display means from the compound number read out in said first computer-readable process routine, the enzyme number read out in said second computer-readable process routine, and the compound number of the another compound read out in said third computer-readable process routine, and further indicating the additional information about the enzyme read out in said third computer-readable process routine on said display means.
 20. The biochemical information computer program product according to claim 19, said computer program product further having, in said file area, a computer-readable receptor information file storing a list showing the relation between receptor numbers of receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, wherein said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with said compound being a substrate, the enzyme numbers of the enzymes with said compound being a product, the receptor numbers of the receptors with said compound being an agonist, and the receptor numbers of the receptors with said compound being an antagonist, and said computer program product further having, in said program area, a computer-readable receptor information detection program for, when said input means accepts data about a compound, detecting additional information about a receptor with said compound being an agonist and/or an antagonist, based on the data; wherein said receptor information detection program comprises: a fifth computer-readable process routine for preparing from data about the compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on said canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a sixth computer-readable process routine for reading, based on the compound number read out in said fifth process routine, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file, a seventh computer-readable process routine for reading at least additional information about the receptor of the receptor number read out in said sixth process routine out of said receptor information file, and an eighth computer-readable process routine for indicating at least the additional information about the receptor read out in said seventh process routine on said display means.
 21. The biochemical information computer program product according to claim 19, said computer program product further having, in said program area, a computer-readable reaction path detection program for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a predetermined reaction path, detecting the predetermined reaction path of said plurality of compounds, wherein said reaction path detection program comprises: a fifth computer-readable process routine for preparing from the data about the predetermined compound accepted through said input means said canonical data uniquely indicating a chemical structure of said predetermined compound, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a sixth computer-readable process routine for reading, based on the compound number read out in said fifth computer-readable process routine, an enzyme number of an enzyme with the predetermined compound being a substrate and an enzyme number of an enzyme with the predetermined compound being a product out of said relation information file, a seventh computer-readable process routine for reading, based on each enzyme number read out in said sixth computer-readable process routine, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file, an eighth computer-readable process routine for repeating said sixth computer-readable process routine and said seventh computer-readable process routine to obtain compounds and enzymes within the predetermined reaction path, and a ninth computer-readable process routine for indicating from enzyme numbers read out in said sixth computer-readable process routine and compound numbers read out in said seventh computer-readable process routine a reaction scheme diagram of said plurality of compounds along the predetermined reaction path on said display means.
 22. The biochemical information computer program product according to claim 19, wherein said input means accepts input of characteristic data about each of atoms constituting a compound and bonding pair data between atoms, wherein said computer program product further has, in said program area, a computer-readable canonical data preparation program for preparing canonical data to uniquely specify a chemical structure of said compound, based on the characteristic or bonding pair data accepted through said input means; and wherein said canonical data preparation program comprises: a computer-readable constituent atom classification routine for classifying the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class, a computer-readable canonical number assignment routine for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification routine, and a computer-readable canonical data preparation routine for preparing said canonical data, based on the canonical numbers assigned to the respective atoms in said canonical number assignment routine.
 23. The biochemical information computer program product according to claim 22, wherein said constituent atom classification routine assigns three types of attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing the fact that atoms different in even only one of these attributes can be determined to be not equivalent, assigns a different class number for each equivalent atom to each atom, where among said three types of attributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom of input number i, b_(ij) is the number of bonds adjoining the atom of input number i and having a bond kind number being j, and d_(ij) is the number of routes that can be traced from the atom of input number i through j bonds in the shortest path, wherein said canonical number assignment routine is arranged so that when in a process for assigning a canonical number to each atom in the ascending order from 1 the canonical number 1 is given to an atom with a highest priority of said class number and thereafter canonical numbers up to the canonical number n are assigned, said canonical number assignment routine selects an atom with a minimum canonical number out of atoms already having their respective canonical numbers and bonding to an atom having no canonical number yet and then gives a canonical number n+1 to an atom with a highest priority of said class number out of atoms bonding to a selected atom and having no canonical number yet, and wherein said canonical data preparation routine gives three types of attributes (P_(i), T_(i), S_(i)) to each atom and aligns these attributes in line to prepare said canonical data, where among said three types of attributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atom bonding to an atom of canonical number i and having a minimum canonical number, T_(i) is a symbol for a type of a bond between the atom of canonical number i and the atom of canonical number P_(i), and S_(i) is a symbol for a kind of the atom of canonical number i.
 24. The biochemical information computer program product according to claim 19, wherein said computer-usable medium is a disk type recording medium or a tape type recording medium.
 25. A biochemical information computer program product used with an information processing apparatus having input means for accepting input of image data indicating biochemical information or symbolic data indicating biochemical information, display means for indicating a reaction scheme diagram of a chemical reaction scheme, and reading means for reading information out of a computer-usable medium, said computer program product comprising: the computer-usable medium, said computer-usable medium having a file area for recording a file and a program area for recording a program and having computer-readable file and program embodied in said computer-usable medium, for letting the reaction scheme diagram be searched for and be indicated by said display means, based on data input through said input means; said computer program product having, in said file area, a computer-readable compound information file for storing a list showing the relation between compound numbers of compounds and canonical data corresponding to said compounds, and additional information about said compounds, a computer-readable enzyme information file for storing a list showing the relation among enzyme numbers of enzymes, compound numbers of compounds being substrates for said enzymes, and compound numbers of compounds being products by said enzymes, and additional information about said enzymes, and a computer-readable relation information file for storing a list showing the relation among compound numbers of compounds as a key, enzyme numbers of enzymes with a predetermined compound being a substrate, and enzyme numbers of enzymes with said compound being a product, and having, in said program area, a computer-readable reaction path detection program for, when said input means accepts data about a predetermined compound selected from a plurality of compounds constituting a predetermined reaction path, detecting the predetermined reaction path of said plurality of compounds, based on the data about the predetermined compound; wherein said reaction path detection program comprises: a first computer-readable process routine for preparing from the data about the predetermined compound accepted through said input means said canonical data uniquely indicating a chemical structure of said predetermined compound, further searching said compound information file, based on the canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a second computer-readable process routine for reading, based on the compound number read out in said first computer-readable process routine, an enzyme number of an enzyme with the predetermined compound being a substrate and an enzyme number of an enzyme with the predetermined compound being a product out of said relation information file, a third computer-readable process routine for reading, based on each enzyme number read out in said second computer-readable process routine, a compound number of a compound being a substrate for said enzyme and a compound number of a compound being a product by said enzyme out of said enzyme information file, a fourth computer-readable process routine for repeating said second computer-readable process routine and said third computer-readable process routine to obtain compounds and enzymes within the predetermined reaction path, and a fifth computer-readable process routine for indicating from enzyme numbers read out in said second computer-readable process routine and compound numbers read out in said third computer-readable process routine a reaction scheme diagram of said plurality of compounds along the predetermined reaction path on said display means.
 26. The biochemical information computer program product according to claim 25, said computer program product further having, in said file area, a computer-readable receptor information file storing a list showing the relation between receptor numbers of receptors and compound numbers of compounds being agonists and/or antagonists for said receptors, and additional information about said receptors, wherein said relation information file stores a list to show the relation among the compound numbers of the compounds as a key, the enzyme numbers of the enzymes with said compound being a substrate, the enzyme numbers of the enzymes with said compound being a product, the receptor numbers of the receptors with said compound being an agonist, and the receptor numbers of the receptors with said compound being an antagonist, and said computer program product further having, in said program area, a computer-readable receptor information detection program for, when said input means accepts data about a compound, detecting additional information about a receptor with said compound being an agonist and/or an antagonist based on the data about the compound; wherein said receptor information detection program comprises: a sixth computer-readable process routine for preparing from data about the compound accepted through said input means said canonical data uniquely indicating a chemical structure of said compound, further searching said compound information file, based on said canonical data, and reading out a compound number corresponding to said canonical data when said canonical data exists in said compound information file, a seventh computer-readable process routine for reading, based on the compound number read out in said sixth computer-readable process routine, a receptor number of a receptor with the compound being an agonist or an antagonist out of said relation information file, an eighth computer-readable process routine for reading at least additional information about the receptor of the receptor number read out in said seventh computer-readable process routine out of said receptor information file, and a ninth computer-readable process routine for indicating at least the additional information about the receptor read out in said eighth computer-readable process routine on said display means.
 27. The biochemical information computer program product according to claim 25, wherein said input means accepts input of characteristic data about each of the atoms constituting a compound and bonding pair data between the atoms, wherein said computer program product further has, in said program area, a computer-readable canonical data preparation program for preparing canonical data to uniquely specify a chemical structure of said compound, based on the characteristic or bonding pair data accepted through said input means; and wherein said canonical data preparation program comprises: a computer-readable constituent atom classification routine for classifying the atoms into different classes each for equivalent atoms and assigning, to each atom, a different class number for each class, a computer-readable canonical number assignment routine for assigning canonical numbers uniquely corresponding to the structure of said compound to the respective atoms, based on the class numbers assigned to the respective atoms in said constituent atom classification routine, and a computer-readable canonical data preparation routine for preparing said canonical data, based on the canonical numbers assigned to the respective atoms in said canonical number assignment routine.
 28. Thebiochemical information computer program product according to claim 27,wherein said constituent atom classification routine assigns three typesof attributes (a_(i), b_(ij), d_(ij)) to each atom and, utilizing thefact that atoms different in even only one of these attributes can bedetermined to be not equivalent, assigns a different class number foreach equivalent atom to each atom, where among said three types ofattributes (a_(i), b_(ij), d_(ij)), a_(i) is a kind number of an atom ofinput number i, b_(ij) is the number of bonds adjoining the atom ofinput number i and having a bond kind number being j, and d_(ij) is thenumber of routes that can be traced from the atom of input number ithrough j bonds in the shortest path; wherein said canonical numberassignment routine is arranged so that when in a process for assigning acanonical number to each atom in the ascending order from 1 thecanonical number 1 is given to an atom with a highest priority of saidclass number and thereafter canonical numbers up to the canonical numbern are assigned in that manner, said canonical number assignment routineselects an atom with a minimum canonical number out of atoms alreadyhaving their respective canonical numbers and bonding to an atom havingno canonical number yet and then gives a canonical number n+1 to an atomwith a highest priority of said class number out of atoms bonding to aselected atom and having no canonical number yet; and wherein saidcanonical data preparation routine gives three types of attributes(P_(i), T_(i), S_(i)) to each atom and aligns these attributes in lineto prepare said canonical data, where among said three types ofattributes (P_(i), T_(i), S_(i)), P_(i) is a canonical number of an atombonding to an atom of canonical number i and having a minimum canonicalnumber, T_(i) is a symbol for a type of a bond between the atom ofcanonical number i and the atom of canonical number P_(i), and S_(i) isa symbol for a kind of the atom of canonical number i.
 29. Thebiochemical information computer program product according to claim 25,wherein said computer-usable medium is a disk type recording medium or atape type recording medium.